A comprehensive nonredundant expressed sequence tag collection for the developing Rattus norvegicus heart
Jennifer J.S. Laffin1,
Todd E. Scheetz2,
Maria de Fatima Bonaldo1,
Rebecca S. Reiter3,
Shereen Chang1,
Mari Eyestone1,
Hakeem Abdulkawy2,
Bartley Brown2,
Chad Roberts2,
Dylan Tack2,
Tamara Kucaba1,
Jim Jung-Ching Lin3,
Val C. Sheffield1,
Thomas L. Casavant2 and
M. Bento Soares1,4
1 Department of Pediatrics and Interdepartmental-Genetics Graduate Program, The University of Iowa, Iowa City, Iowa 52242
2 Department of Engineering, The University of Iowa, Iowa City, Iowa 52242
3 Department of Biological Sciences, The University of Iowa, Iowa City, Iowa 52242
4 Departments of Biochemistry, Physiology, and Biophysics, and Orthopaedics, The University of Iowa, Iowa City, Iowa 52242
ABSTRACT
Congenital heart defects affect approximately 1,000,000 people in the United States, with 40,000 new births contributing to that number every year. A large percentage of these defects can be attributed to septal defects. We assembled a nonredundant collection of over 12,000 expressed sequence tags (ESTs) from a total of 30,000 ESTs, with the ultimate goal of identifying genes that are spatially and/or temporally regulated during heart septation. These ESTs were compiled from nonnormalized, normalized, and serially subtracted cDNA libraries derived from two sets of tissue samples. The first includes microdissected rat hearts from embryonic (E) days E13, E15, and E16.5–E18.5 and adult heart. The second includes hearts from embryonic days E17, E19, and E21 and postnatal (P) days P1, P12, P74, and P200. Over 6,000 novel ESTs were identified in the libraries derived from these two sets of tissues, all of which have been contributed to the NCBI rat UniGene collection. It is anticipated that such EST and cDNA clone resources will prove invaluable to gene expression studies aimed at understanding the molecular mechanisms underlying heart septation defects.
cDNA library; gene expression; cardiovascular development
INTRODUCTION
CONGENITAL CARDIOVASCULAR defects affect approximately 1,000,000 Americans, with 40,000 affected babies born each year (3). There are over 35 recognized types of congenital heart defect, the leading cause of death due to heart disease (3). The most common form of congenital heart defect is ventricular septal defect, accounting for 14–17%, with atrioventricular septal defects contributing another 4–10% (3). The identification of spatially and/or temporally regulated genes in the developing heart is central to the understanding of pathways controlling normal heart development. Perturbations of these pathways are likely to result in a congenital heart defect. Unraveling the molecular mechanisms underlying congenital heart diseases might enable the development of strategies for early diagnosis and ultimately for prevention of these defects.
Heart development requires intricate coordination of temporally and spatially regulated gene expression. In the rat, heart septation occurs at approximately embryonic (E) days E13–E15, resulting in the formation of two atrial and two ventricular chambers and four valves. It is at this time that many congenital heart defects become apparent. Current methods for large-scale gene discovery and parallel analysis of gene expression can provide a plethora of information on the molecular basis of heart development, stress response, and disease in a relatively short amount of time. For example, Hwang et al. (16) reported the generation of 43,285 expressed sequence tags (ESTs) from human heart cDNA libraries for the identification of hypertrophy-related genes, and Megy et al. (19) identified 35 heart-specific genes from 4,303 ESTs through the analysis of human EST start libraries. These studies were done within the context of the rat gene discovery program at the University of Iowa (21) and demonstrate the power of the EST approach for rapid identification of genes expressed in the cardiovascular system. Although the rat is one of the leading model organisms in cardiovascular physiology (8, 17), a comprehensive set of rat cDNAs and ESTs for the study of genes related to cardiovascular diseases is not yet available. Here we report the development of a nonredundant collection of 12,933 rat heart ESTs from nonnormalized, normalized, and serially subtracted cDNA libraries derived from two sample sets. The first set includes microdissected rat hearts from embryonic (E) days E13, E15, and E16.5–E18.5 and adult heart. These three embryonic time points correspond to before, during, and after heart septation, the milestone at which most congenital heart diseases occur. The microdissection into atria, ventricles, and atrioventricular canal and outflow provides more concentrated populations of specialized cells for the identification of novel spatially regulated genes.
The second set includes hearts from embryonic days E17, E19, and E21 and postnatal (P) days P1, P12, P74, and P200. This second set allows us to identify novel transcripts from later prenatal stages of development as well as postnatal stages. Generation of ESTs from serially subtracted and normalized cDNA libraries enabled identification of mRNAs representing a wide range of expression, including rare mRNAs likely to be missed in conventional EST discovery projects that rely on random sequencing of cDNAs from standard nonnormalized libraries. We anticipate that this collection will prove invaluable to subsequent studies aimed at the identification of temporally and/or spatially regulated genes in the developing heart. It is noteworthy, however, that because every effort was made to minimize redundant identification of ESTs throughout the project, the resulting EST data do not allow for quantitative determination of transcript abundance.
MATERIALS AND METHODS
Embryonic rats.
The first set comprised three tissue groups from days E13, E15, and E16.5–E18.5 obtained from outbred Sprague-Dawley rats. The E13 primordial ventricle was dissected away from the outflow and the atrioventricular canal. Atria and approximately two-thirds to one-half of the lower portion of the ventricle were dissected from E15 and E16.5–E18.5 hearts. The remainder of the tissue, which contains a mixture of outflow tract and the atrioventricular region embedded in ventricular tissue, was designated as atrioventricular canal and outflow. The harvested tissue was flash frozen in liquid nitrogen and stored at −80°C. A second set of whole hearts from embryonic days E17, E19, and E21 and postnatal days P1, P12, P74, and P200 was harvested, flash frozen in liquid nitrogen, and stored at −80°C.
Heart libraries.
Total RNA was isolated from rat heart tissue samples by acid guanidinium thiocyanate and phenol:chloroform as described by Chomczynski and Sacchi in 1987 (11) and modified by Chomczynski and Mackey in 1995 (10). RNA quality was determined by 1.2% formaldehyde-agarose gel electrophoresis. The amount of RNA was determined by spectrophotometry. Poly(A)+ RNA was isolated by oligo (dT)-cellulose (New England Biolabs) chromatography according to the protocol described by Ausubel et al. in 1995 (5). cDNA was synthesized using an oligo (dT) primer containing a NotI restriction site and a unique library tag [5'-GCGGCCGC-library tag-oligo (dT)20–22 3'] (14). Library tags are as follows: embryonic days E16.5–E18.5 atrioventricular canal, 5'-GAACC-3'; days E16.5–E18.5 atrium, 5'-GATTC-3'; days E16.5–E18.5 ventricle, 5'-GTTCG-3'; day E15 atrioventricular canal, 5'-GAAGG-3'; day E15 atrium, 5'-GAAGC-3'; day E15 ventricle, 5'-GGTGTC-3'; and day E13 ventricle, 5'-ACAAC-3'. At the 5' end of the cDNA an EcoRI linker was ligated for cloning (5'-EcoRI-GGCACGAGG-3'). cDNAs were directionally cloned into a phagemid vector (pT7T3-Pac) digested with NotI and EcoRI according to previously described protocols (9, 23). Each library was normalized individually as described by Bonaldo et al. (9), and the libraries were subsequently pooled for subtractions (Scheetz et al., unpublished observations). The first subtraction was generated using, as tracer, a pool of the normalized embryonic heart libraries and a normalized adult whole heart library, and as driver, a pool comprising ribosomal clones (4%), 7,000 arrayed and sequenced clones from start libraries (1,000 sequences each library; 26%), 7,000 arrayed and sequenced clones from the normalized libraries (1,000 sequences each; 40%), and 5,000 arrayed and sequenced clones from a pool of normalized and nonnormalized libraries (30%).
The second subtraction used the first subtracted library as a tracer and a driver composed of 4,800 arrayed and sequenced clones from the first subtracted library. The third subtracted library was also derived from the first subtracted library upon hybridization with a driver containing the following: 13,000 arrayed and sequenced clones from the second subtracted library (20%), the 16 most prevalent clones identified in the second subtracted library (10%), 5,000 arrayed and sequenced clones from the first subtracted library (20%), and 5,472 re-arrayed and sequenced clones representing the nonredundant set of heart ESTs that we had identified up until then (50%). Two other libraries were generated using a second set of hearts harvested from embryonic days E17, E19, and E21 and postnatal days P1, P12, P74, and P200. These were pooled for construction of a cDNA library using the library tag 5'-ATAAGATAAC-3'. This was followed by library normalization as described above. A brief schema of library construction is shown in Fig. 1, and a description of the source of each of the libraries can be found in Table 1. The purpose of developing such a comprehensive set of nonnormalized, normalized, and serially subtracted heart cDNA libraries was to maximize both the efficiency and the cost-effectiveness of our heart EST discovery project, while increasing our chances of identifying ESTs derived from rare transcripts. Such ESTs are less likely to be sampled in conventional large-scale gene discovery programs that rely on the production of ESTs from randomly picked clones from standard, nonnormalized cDNA libraries.
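Because each library carries its own tag between the NotI site and the oligo (dT) stretch, a 3' read from a pooled library can in principle be traced back to its source tissue. The following is a minimal sketch of that lookup, not the procedure used by ESTprep: the function name, the orientation of the read (primer strand, 5' to 3'), and the minimum poly(T) length are assumptions made for illustration, while the tag sequences are those listed above.

```python
from typing import Optional

# Library tags as listed in the text (primer strand, 5'->3').
LIBRARY_TAGS = {
    "GAACC": "E16.5-E18.5 atrioventricular canal",
    "GATTC": "E16.5-E18.5 atrium",
    "GTTCG": "E16.5-E18.5 ventricle",
    "GAAGG": "E15 atrioventricular canal",
    "GAAGC": "E15 atrium",
    "GGTGTC": "E15 ventricle",
    "ACAAC": "E13 ventricle",
    "ATAAGATAAC": "pooled second-set whole hearts",
}

NOTI_SITE = "GCGGCCGC"
MIN_POLY_T = 8  # assumed minimum run of T after the tag; not taken from the paper


def assign_library(read: str) -> Optional[str]:
    """Return the source library for a 3' read, or None if no tag is recognized.

    Hypothetical helper: expects NotI site, then library tag, then poly(T).
    """
    read = read.upper()
    site = read.find(NOTI_SITE)
    if site == -1:
        return None
    after_site = read[site + len(NOTI_SITE):]
    for tag, library in LIBRARY_TAGS.items():
        # Require the poly(T) run so that a chance match to a tag is less likely.
        if after_site.startswith(tag + "T" * MIN_POLY_T):
            return library
    return None
```

Requiring the poly(T) run immediately after the tag is a design choice of this sketch; it keeps the short 5-nucleotide tags from matching arbitrary sequence downstream of a NotI site.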
EST discovery rates for start, normalized, and subtracted libraries were determined by clustering all ESTs for which a restriction site with upstream vector and 100 bp of high-quality sequence was found computationally (Scheetz et al., unpublished observation). Initially, approximately 1,000 ESTs were selected from each library. Clustering analysis was then performed for assessment of library quality and EST novelty, based upon which decisions were made for further arraying and sequencing. Typically, several thousand additional clones were arrayed from the subtracted libraries since they exhibited the highest novelty rates. EST discovery rates were assessed not only for each library within itself, but also for each library relative to the entire collection of rat heart ESTs available at the time.
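The discovery-rate assessment can be illustrated with a short sketch. It assumes that "novelty" means the fraction of ESTs in a batch that fall into a cluster not seen earlier in the sequencing order; that definition, the function name, and the batch size are assumptions for illustration and are not taken from the unpublished pipeline.

```python
def incremental_novelty(cluster_ids, window=1000):
    """Return a list of (ests_seen, novelty_rate) pairs, one per batch.

    cluster_ids: cluster assignment of each EST, in sequencing order.
    novelty_rate: fraction of the batch whose cluster was not seen before.
    """
    seen = set()
    rates = []
    for start in range(0, len(cluster_ids), window):
        batch = cluster_ids[start:start + window]
        novel = 0
        for cid in batch:
            if cid not in seen:
                seen.add(cid)
                novel += 1
        rates.append((start + len(batch), novel / len(batch)))
    return rates


# Toy example with six ESTs and a batch size of three.
print(incremental_novelty(["a", "b", "a", "c", "c", "c"], window=3))
```

Under the same assumed definition, the totals reported in RESULTS (12,933 clusters from 31,175 sequences) correspond to an overall ratio of about 0.41.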
Sequencing, analysis, and classification.
Sequencing was performed using ABI 3700 or ABI 377 sequencers. Arrayed cDNA clones were sequenced from both the 5' and 3' ends using M13 reverse (5'-AGCGGATAACAATTTCACACAGGA-3') and forward primers (5'-GTTTTCCCAGTCAC-3'). The EST sequences were processed using the sequence-processing pipeline developed at the University of Iowa's Center for Bioinformatics and Computational Biology (Scheetz and Casavant, unpublished observations). Base calling and per-base quality values were generated using Phred (12, 13). "ESTprep" was used to detect the primary EST features: cloning vector, restriction site, tissue tag, poly(A) tail, and polyadenylation signals (22). The nine most prevalent alternative polyadenylation signals (6) were evaluated, along with the two canonical forms (AAUAAA and AUUAAA). For those sequences that satisfied the quality criteria imposed by ESTprep, RepeatMasker (S. Green, unpublished data) was used to identify contaminating sequence (vector, mitochondrial, and bacterial) and to mask known rodent repetitive elements as identified in Repbase Update (18).
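As an illustration of the polyadenylation-signal check, the sketch below scans a short window upstream of a poly(A) run for the two canonical signals in their cDNA form (AATAAA and ATTAAA). It is not ESTprep's implementation: the minimum tail length and window size are assumed values, and the nine alternative signals are omitted because the text does not list them.

```python
import re

CANONICAL_SIGNALS = ("AATAAA", "ATTAAA")  # cDNA forms of AAUAAA and AUUAAA
MIN_TAIL = 10       # assumed minimum poly(A) run length
SEARCH_WINDOW = 40  # assumed number of bases upstream of the tail to scan


def find_polya_signal(seq):
    """Return (signal, distance_to_tail) for the signal nearest the tail, or None.

    distance_to_tail is the number of bases between the end of the signal
    and the first base of the poly(A) run.
    """
    seq = seq.upper()
    tail = re.search(r"A{%d,}" % MIN_TAIL, seq)
    if tail is None:
        return None
    tail_start = tail.start()
    upstream = seq[max(0, tail_start - SEARCH_WINDOW):tail_start]
    best = None
    for signal in CANONICAL_SIGNALS:
        pos = upstream.rfind(signal)  # occurrence closest to the tail
        if pos != -1:
            distance = len(upstream) - (pos + len(signal))
            if best is None or distance < best[1]:
                best = (signal, distance)
    return best
```

A signal that directly abuts the tail would be absorbed into the poly(A) run by this simple regular expression; a production tool would need to handle that case.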
UIcluster (25) groups ESTs into clusters by sequence similarity. Consequently, all ESTs derived from the same transcript unit should fall into a single cluster. Representative sequences were selected from each cluster based on sequence quality and on presence of the correct tissue tag, polyadenylation signal sequence, and poly(A) tail. Comparison against the National Center for Biotechnology Information (NCBI) UniGene clustering reveals a difference of approximately 5% in clusters (T. E. Scheetz, L. J., B. Berger, S. Mackerly, S. A. Baumes, R. I. I., Brown, S. Chang, J. Coco, J. Conklin, K. Crouch, M. Donohue, G. Doonan, C. Estes, M. Eyestone, K. Fishler, J. Gardiner, L. Guo, B. Johnson, C. Keppel, R. Kreger, M. Lebeck, R. Marcelino, V. Miljkovich, M. Perdue, L. Qui, J. Rehmann, R. S. Reiter, B. Rhoads, K. Schaefer, C. Smith, I. Sunjevaric, K. Trout, N. Wu, C. L. Birkett, J. Bischof, B. Gackle, A. Gavin, B. Mokrzycki, C. Moressi, B. O'Leary, K. Pedretti, C. Roberts, M. Smith, D. Tack, N. Trivedi, T. Kucaba, T. Freeman, J. Lin, M. F. Bonaldo, T. L. Casavant, V. C. Sheffield, and M. B. Soares, unpublished data).
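The idea of grouping ESTs by sequence similarity can be shown with a toy single-linkage example. This is not the UIcluster algorithm: it swaps in a much simpler criterion (two sequences join a cluster if they share at least one exact k-mer), ignores reverse complements and sequencing errors, and all names and the value of k are assumptions chosen for illustration.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_by_shared_kmers(seqs, k=20):
    """Single-linkage clustering on shared exact k-mers (union-find).

    Returns a sorted list of clusters, each a list of indices into seqs.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner = {}  # k-mer -> index of the first sequence containing it
    for idx, seq in enumerate(seqs):
        for kmer in kmers(seq.upper(), k):
            owner = first_owner.setdefault(kmer, idx)
            if owner != idx:
                parent[find(idx)] = find(owner)  # merge the two clusters

    clusters = {}
    for idx in range(len(seqs)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


reads = ["ACGTACGTAA", "GTACGTAATT", "TTTTGGGGCC", "GGGGCCCCAA"]
print(cluster_by_shared_kmers(reads, k=6))  # two clusters of two reads each
```

Because linkage is transitive, overlapping reads from the same transcript end up together even when the first and last reads share no sequence directly, which is the behavior the text relies on when it expects one cluster per transcript unit.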
Nucleic acid and protein BLAST-based analyses (2) were performed for putative function identification of all identified ESTs. Threshold E-values of e-40 for nucleic acid and e-20 for protein homologies were used for annotation. The putative identification and GenBank accession number of the best observed BLAST hit were selected for each EST. Gene name and product information, when available, were collected directly from the NCBI web site (http://www.ncbi.nlm.nih.gov/). Putative functions were then organized into categories (molecular function, biological process, and cellular component) conforming to the Gene Ontology (GO) Consortium (4). Swiss-Prot keywords, where identified, were translated into GO terms. Information on the GO category was extracted for each sequence with available product information. A summary chart of these data can be found in Figs. 3 and 4.
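The annotation rule described above (keep only hits at or below the E-value threshold, then take the best one) can be sketched as follows. The thresholds are read here as 1e-40 and 1e-20, which is an interpretation of "e-40" and "e-20"; the hit tuples, accession strings, and function name are hypothetical and do not reflect any particular BLAST output format.

```python
NUCLEOTIDE_MAX_E = 1e-40  # assumed reading of the "e-40" threshold
PROTEIN_MAX_E = 1e-20     # assumed reading of the "e-20" threshold


def best_significant_hit(hits, kind):
    """Return the lowest-E-value hit that passes the threshold, or None.

    hits: iterable of (accession, description, evalue) tuples.
    kind: "nucleotide" or "protein".
    """
    if kind == "nucleotide":
        threshold = NUCLEOTIDE_MAX_E
    elif kind == "protein":
        threshold = PROTEIN_MAX_E
    else:
        raise ValueError("kind must be 'nucleotide' or 'protein'")
    significant = [hit for hit in hits if hit[2] <= threshold]
    if not significant:
        return None
    return min(significant, key=lambda hit: hit[2])


# Hypothetical hits; the accessions are placeholders, not real records.
example_hits = [
    ("HYP0001", "example hit one", 1e-30),
    ("HYP0002", "example hit two", 1e-55),
    ("HYP0003", "example hit three", 1e-42),
]
print(best_significant_hit(example_hits, "nucleotide"))
```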

Fig. 3. Gene Ontology. These charts provide a breakdown of the Gene Ontology annotations available for a select few of the sequences in the nonredundant set. Top refers to annotations under "biological processes." Directly below are the three main categories within biological processes broken down. Bottom includes the data found in the "cellular component" category. The numbers associated with each chart represent the number of expressed sequence tags (ESTs) in that category.
Fig. 4. Gene ontology. These charts provide a breakdown of the Gene Ontology annotations available for the "molecular function" category and its subcategories.
RESULTS
Twenty libraries were successfully generated from eight rat heart tissue samples (Table 1). Typically, cDNA inserts range in size from 0.35 to 2.5 kb, with an average length of approximately 1 kb. A total of 31,175 sequences were generated, and 12,933 unique clusters were identified. The overall novelty throughout the sequencing phase remained above 40%, with an increased rate being observed after each cycle of subtraction (Fig. 2). Most notably, sequencing from the first two subtracted libraries increased novelty by approximately 40%. The increases in novelty ratios were consistently seen following a cycle of normalization or subtraction, as evidenced in Fig. 2. Table 2 provides a description of the number of sequences contributed and the respective cluster representations of the tissue regions and time points from embryonic rats set 1.

Fig. 2. Incremental novelty. Contributions from each library are as follows: sequences 1–4,742, AA0–AG0; sequences 4,743–5,511, CM0; sequences 5,512–11,317, AA1–AG1; sequences 11,318–15,937, BJ0p; sequences 15,938–24,616, BJ1; sequences 24,617–30,000, BJ2; sequences 30,001–30,826, A0–C4 (by library tag); sequences 30,827–31,616, CS0s; and sequences 31,617–33,100, CS0.
Table 2. Dissected library (AA0–AG0, AA1–AG1, and BJ0p–BJ2) coverage breakdown by number of sequences and clusters covered by each region and time point
The BLAST (1) annotation of the final nonredundant set of ESTs showed that 4,235 ESTs had a significant hit to the NCBI nonredundant nucleotide database (an E-value of ≤e-40). A total of 5,449 ESTs had significant hits to the mouse EST database, and 3,365 ESTs had significant hits to the human EST database. Figures 3 and 4 provide a breakdown of the 1,279 ESTs with available Gene Ontology terms for each of the three main categories.
Cluster analysis revealed a total of 2,828 "heart-only" rat UniGene clusters, that is, clusters containing only ESTs derived from heart tissue. Of the singleton clusters (single EST clusters), 18 are associated with a known gene, 56 are highly similar to (>90% identity) a known gene or to a hypothetical protein, 93 are moderately similar to (70–90% identity) a known gene or to a hypothetical protein, 104 are weakly similar to (<70% identity) a known gene or to a hypothetical protein, and 1,935 are unannotated ESTs. There are 367 clusters containing two or more ESTs each. Of these, 5 are associated with a known gene, 32 are highly similar to a known gene, 25 are moderately similar to a known gene or to a hypothetical protein, and 50 are weakly similar to a known gene or to a hypothetical protein, whereas all others have no annotation. Of the 255 remaining clusters without annotation, 1 cluster contains 12 members, 1 cluster contains 10 members, 9 clusters contain 5 members, 14 clusters contain 4 members, 55 clusters contain 3 members, and 175 clusters contain 2 members. Of the known genes identified, annotation included genes identified in a variety of species. It is noteworthy that included in the set of heart-specific clusters are known heart-specific genes, such as aortic smooth muscle α-actin 2, ventricular myosin light chain 2, myosin regulator light chain 1 (atrial isoform), I38344 titin (cardiac muscle), ryanodine receptor 2 (calcium release channel isoform 2, cardiac), cytochrome c oxidase polypeptide VIII-heart, myosin heavy chain [cardiac muscle β-isoform (MyHC-β)], and calsequestrin-cardiac isoform.
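The cluster counts in the preceding paragraph can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the numbers stated in the text; the variable names are illustrative.

```python
# Counts copied from the paragraph above.
SINGLETONS = {
    "known gene": 18,
    "highly similar (>90%)": 56,
    "moderately similar (70-90%)": 93,
    "weakly similar (<70%)": 104,
    "unannotated": 1935,
}
MULTI_TOTAL = 367  # clusters with two or more ESTs
MULTI_ANNOTATED = {
    "known gene": 5,
    "highly similar": 32,
    "moderately similar": 25,
    "weakly similar": 50,
}
# cluster size -> number of unannotated clusters of that size
UNANNOTATED_BY_SIZE = {12: 1, 10: 1, 5: 9, 4: 14, 3: 55, 2: 175}

singleton_total = sum(SINGLETONS.values())
multi_unannotated = MULTI_TOTAL - sum(MULTI_ANNOTATED.values())
clusters_by_size = sum(UNANNOTATED_BY_SIZE.values())
ests_in_unannotated = sum(size * n for size, n in UNANNOTATED_BY_SIZE.items())

print(singleton_total)      # 2206
print(multi_unannotated)    # 255
print(clusters_by_size)     # 255
print(ests_in_unannotated)  # 638
```

The size breakdown adds up to 255 clusters, the same figure obtained by subtracting the 112 annotated multi-member clusters from 367, so the 255 unannotated entries are clusters rather than individual ESTs; together they contain 638 ESTs.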
A complete list of all of the members of the nonredundant set including annotation derived from BLAST is available as Supplemental Material, from the Physiological Genomics web site.1
DISCUSSION
The EST approach to gene discovery has proven invaluable to many fields (7, 24, 26, 27). Here we describe the identification of a comprehensive nonredundant collection of rat heart ESTs from nonnormalized, normalized, and serially subtracted cDNA libraries constructed from microdissected embryonic heart tissue. The process of serial subtraction enabled discovery of novel ESTs at unprecedented efficiency. BLAST annotation revealed that our set contains genes known to be important in development and in disease pathology, such as Anf, Erbb2, Neurotrophin 3, Jagged-1, Nkx2.5, eHand, Irx4, Gata1, Gata4, Gata6, β-myosin heavy chain, Nfatc, Pitx3, αB-crystallin, α-actin, β-actin, Vegfr2, Endoglin, TGFbeta 2, myosin light chain, Smad6, Troponins I, T, and C, α-tropomyosin, Frizzled 2, and many others. Hence, we anticipate that the cDNA clones and ESTs that we have compiled will prove invaluable for the identification of genes and pathways underlying normal heart development and for cardiovascular research at large.
BLAST analysis of clusters in the nonnormalized adult whole heart library, a superficial assessment of relative expression, indicated that the most abundant clusters corresponded to the genes for 18S, 5.8S, and 28S rRNAs, α-cardiac myosin heavy chain, glyceraldehyde-3-phosphate dehydrogenase, α-cardiac actin, atrial myosin light chain 1, elongation factor 1-α, ventricular myosin light chain 1, and atrial natriuretic factor. Cluster analysis of the heart regions indicated that these 10 transcripts are also the most prevalent in all regions and time points. In addition, ribosomal protein S2 was found among the most abundant transcripts in the embryonic day E15 microdissected regions.
Table 2 demonstrates the even coverage across tissue samples accomplished in the construction of the set. Interestingly, there is a significant contribution from the smaller atrioventricular canal and outflow libraries, which points to the advantage of using libraries derived from microdissected tissues for EST discovery.
The Gene Ontology description in Figs. 3 and 4 divides all functions of each available EST into three major groups: "biological processes" (subclassified into cellular process, development, behavior, and physiological processes), "cellular component" (extracellular, cellular, and unlocalized), and "molecular function" (protein tagging, transcription regulator, defense/immunity protein, binding, structural molecule, transporter, enzyme, enzyme regulator, translation regulator, apoptosis, signal transducer, and chaperone activities). The largest categories indicated in molecular function include binding activity (ability to interact with one or more molecules), enzyme activity, signal transducer activity, and transcription regulator activity. Characterization of genes involved in these subcategories will provide insight into the differences between processes pertaining to cardiac development vs. those involved in normal cellular function.
It should be emphasized that the value of this set is not limited to studies of heart septation, since the EST collection described here represents transcripts expressed throughout a wide developmental range. Last, but not least, it is noteworthy that we have identified a number of genes of as yet unknown function. Future studies will determine their potential role in normal heart development.
GRANTS
This research was supported by National Institutes of Health Grant 2RO1-HL-59789.
ACKNOWLEDGMENTS
We acknowledge Da-Zhi Wang and Sonja Krob for assistance with embryo dissections, and we thank the Oakdale team for their sequencing efforts: Robert Brown, Jim Conklin, Keith Crouch, Micca Donohue, Greg Doonan, Katrina Fishler, Brad Johnson, Catherine Keppel, Rikki Kreger, Mark Lebeck, Mindee Perdue, Bridgette Rhoads, Kelly Schaefer, Christina Smith, and Kurtis Trout.
FOOTNOTES
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: J. J. S. Laffin, The Univ. of Iowa, 4111 MERF, 375 Newton Road, Iowa City, IA 52242 (E-mail: jennifer-laffin@uiowa.edu).
10.1152/physiolgenomics.00186.2003.
1 The Supplementary Material for this article is available online at http://physiolgenomics.physiology.org/cgi/content/full/00186.2003/DC1. 
REFERENCES
- Altschul SF, Gish W, Miller W, Myers EW, and Lipman DJ. Basic local alignment search tool. J Mol Biol 215: 403–410, 1990.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, and Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402, 1997.
- American Heart Association. 2002 Heart and Stroke Statistical Update. Dallas, TX: American Heart Association, 2001.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, and Sherlock G. Gene Ontology: tool for the unification of biology. Gene Ontology Consortium. Nat Genet 25: 25–29, 2000.
- Ausubel FM, Katagiri F, Mindrinos M, and Glazebrook J. Use of Arabidopsis thaliana defense-related mutants to dissect the plant response to pathogens. Proc Natl Acad Sci USA 92: 4189–4196, 1995.
- Beaudoing E, Frier S, Wyatt JR, Claverie JM, and Gautheret D. Patterns of variant polyadenylation signal usage in human genes. Genome Res 10: 1001–1010, 2000.
- Boardman PE, Sanz-Ezquerro J, Overton IM, Burt DW, Bosch E, Fong WT, Tickle C, Brown WR, Wilson SA, and Hubbard SJ. A comprehensive collection of chicken cDNAs. Curr Biol 12: 1965–1969, 2002.
- Boixel C, Fontaine V, Rucker-Martin C, Milliez P, Louedec L, Michel JB, Jacob MP, and Hatem SN. Fibrosis of the left atria during progression of heart failure is associated with increased matrix metalloproteinases in the rat. J Am Coll Cardiol 42: 336–344, 2003.
- Bonaldo MF, Lennon G, and Soares MB. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 6: 791–806, 1996.
- Chomczynski P and Mackey K. Modification of the TRI reagent procedure for isolation of RNA from polysaccharide- and proteoglycan-rich sources. Biotechniques 19: 942–945, 1995.
- Chomczynski P and Sacchi N. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162: 156–159, 1987.
- Ewing B and Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194, 1998.
- Ewing B, Hillier L, Wendl MC, and Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8: 175–185, 1998.
- Gavin AJ, Scheetz TE, Roberts CA, O'Leary B, Braun TA, Sheffield VC, Soares MB, Robinson JP, and Casavant TL. Pooled library tissue tags for EST-based gene discovery. Bioinformatics 18: 1162–1166, 2002.
- Hwang JJ, Allen PD, Tseng GC, Lam CW, Fananapazir L, Dzau VJ, and Liew CC. Microarray gene expression profiles in dilated and hypertrophic cardiomyopathic end-stage heart failure. Physiol Genomics 10: 31–44, 2002. First published April 30, 2002; 10.1152/physiolgenomics.00122.2001.
- Jacob F, Ariza P, and Osborn JW. Renal denervation chronically lowers arterial pressure independent of dietary sodium intake in normal rats. Am J Physiol Heart Circ Physiol 284: H2302–H2310, 2003. First published February 27, 2003; 10.1152/ajpheart.01029.2002.
- Jurka J. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet 16: 418–420, 2000.
- Megy K, Audic S, and Claverie JM. Heart specific genes revealed by EST sampling. Genome Biol 3: PREPRINT0008, 2002.
- Scheetz TE, Raymond MR, Nishimura DY, McClain A, Roberts C, Birkett C, Gardiner J, Zhang J, Butters N, Sun C, Kwitek-Black A, Jacob H, Casavant TL, Soares MB, and Sheffield VC. Generation of a high-density rat EST map. Genome Res 11: 497–502, 2001.
- Scheetz TE, Trivedi N, Roberts CA, Kucaba T, Berger B, Robinson NL, Birkett CL, Gavin AJ, O'Leary B, Braun TA, Bonaldo MF, Robinson JP, Sheffield VC, Soares MB, and Casavant TL. ESTprep: preprocessing cDNA sequence reads. Bioinformatics 19: 1318–1324, 2003.
- Soares MB, Bonaldo MF, Jelene P, Su L, Lawton L, and Efstratiadis A. Construction and characterization of a normalized cDNA library. Proc Natl Acad Sci USA 91: 9228–9232, 1994.
- Sterky F, Regan S, Karlsson J, Hertzberg M, Rohde A, Holmberg A, Amini B, Bhalerao R, Larsson M, Villarroel R, Van Montagu M, Sandberg G, Olsson O, Teeri TT, Boerjan W, Gustafsson P, Uhlen M, Sundberg B, and Lundeberg J. Gene discovery in the wood-forming tissues of poplar: analysis of 5,692 expressed sequence tags. Proc Natl Acad Sci USA 95: 13330–13335, 1998.
- Trivedi N, Bischof J, Davis S, Pedretti K, Scheetz TE, Braun TA, Roberts CA, Robinson NL, Sheffield VC, Soares MB, and Casavant TL. Parallel creation of non-redundant gene indices from partial mRNA transcripts. Fut Generation Comput Syst 18: 863–870, 2002.
- Vasmatzis G, Essand M, Brinkmann U, Lee B, and Pastan I. Discovery of three genes specifically expressed in human prostate by expressed sequence tag database analysis. Proc Natl Acad Sci USA 95: 300–304, 1998.
- Whitfield CW, Band MR, Bonaldo MF, Kumar CG, Liu L, Pardinas JR, Robertson HM, Soares MB, and Robinson GE. Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee. Genome Res 12: 555–566, 2002.