An ordered collection of expressed sequences from Cryphonectria parasitica and evidence of genomic microsynteny with Neurospora crassa and Magnaporthe grisea

Angus L. Dawe, Vanessa C. McMains, Maria Panglao†, Shin Kasahara‡, Baoshan Chen§ and Donald L. Nuss

Center for Biosystems Research, University of Maryland Biotechnology Institute, 5115 Plant Sciences Building, College Park, MD 20742, USA

Correspondence
Donald L. Nuss
nuss@umbi.umd.edu


   ABSTRACT
Cryphonectria parasitica, the causative agent of chestnut blight, has proven to be a tractable experimental system for studying fungal pathogenesis. Moreover, the development of infectious cDNA clones of C. parasitica hypoviruses, capable of attenuating fungal virulence, has provided the opportunity to examine molecular aspects of fungal plant pathogenesis in the context of biological control. In order to establish a genomic base for future studies of C. parasitica, the authors have analysed a collection of expressed sequences. A mixed cDNA library was prepared from RNA isolated from wild-type (virus-free) and hypovirus-infected C. parasitica strains. Plasmid DNA was recovered from individual transformants and sequenced from the 5' end of the insert. Contig analysis of the collected sequences revealed that they represented approximately 2200 individual ORFs. An assessment of functional diversity present in this collection was achieved by using the BLAST software utilities and the NCBI protein database. Candidate genes were identified with significant potential relevance to C. parasitica growth, development, pathogenesis and vegetative incompatibility. Additional investigations of a 12·9 kbp genomic region revealed microsynteny between C. parasitica and both Neurospora crassa and Magnaporthe grisea, two closely related fungi. These data represent the largest collection of sequence information currently available for C. parasitica and are now forming the basis of further studies using microarray analyses to determine global changes in transcription that occur in response to hypovirus infection.


Abbreviations: EST, expressed sequence tag; GO, gene ontology

The GenBank accession numbers for the sequences determined in this work are CB686454–CB690670.

Tables of clones classified according to molecular function and biological process are available as supplementary data with the online version of this paper at http://mic.sgmjournals.org.

†Present address: Children's National Medical Center, Center for Genetic Medicine, 111 Michigan Avenue NW, Washington DC 20010, USA.

‡Present address: Department of Molecular Biology, Keio University School of Medicine, 144 : 8 Ogura, Saiwai, Kawasaki, Kanagawa 212-0054, Japan.

§Present address: Biotechnology Research Center, Guangxi University, 13 Xuiling Road, Nanning, Guangxi 530005, PR China.


   INTRODUCTION
Genomic studies have provided a wealth of information for numerous organisms, including many fungi (reviewed by Bennett & Arnold, 2001). The recent completion of the Neurospora crassa genome (Galagan et al., 2003) has provided the fungal research community with a vast resource of data. This success is being supplemented by efforts to make more widely available sequence data prepared in the private sector for the economically important rice pathogen Magnaporthe grisea (www-genome.wi.mit.edu/annotation/fungi/magnaporthe/) and the model fungus Aspergillus nidulans (www-genome.wi.mit.edu/annotation/fungi/aspergillus/index.html). As an alternative to sequencing a genome, analyses of expressed sequence tags (ESTs) have also provided a large quantity of gene identification information (reviewed by Skinner et al., 2001). Recent descriptions of EST sequencing projects from phytopathogenic fungi include Phytophthora infestans (Kamoun et al., 1999), Mycosphaerella graminicola (Keon et al., 2000), Blumeria graminis (Thomas et al., 2001) and Magnaporthe grisea (Kim et al., 2001). This approach has the benefit of producing substantial quantities of data from organisms that may not be the focus of larger-scale genome sequencing ventures. The Internet has provided considerable utility as a universal method to disseminate such information, with many web sites such as COGEME (cogeme.ex.ac.uk: Soanes et al., 2002) that provide search-and-retrieve access to sequence information from one or several organisms.

The focus of our studies, the chestnut blight fungus Cryphonectria parasitica, was accidentally introduced into the United States in the early part of the twentieth century (Merkel, 1906) and has been responsible for the destruction of the native population of the American chestnut throughout almost the entire natural range of this hardwood species. C. parasitica is a Sordariomycete, a classification of ascomycetes that includes many phytopathogenic fungi, including the genera Ophiostoma (containing the species responsible for Dutch elm disease) and Magnaporthe. Further interest in C. parasitica stems from the observation that natural populations of the fungus can contain virulence-attenuating dsRNA elements, or hypoviruses (Anagnostakis & Day, 1979). The application of molecular biology techniques enabled the development of rigorously tested hypovirus and C. parasitica reverse genetics systems (reviewed by Dawe & Nuss, 2001). Subsequent studies led to the isolation and characterization of a number of genes important for virulence, including components of G-protein signalling pathways (Choi et al., 1995; Kasahara & Nuss, 1997; Kasahara et al., 2000; Parsley et al., 2003). Consistent with the prediction that hypoviruses alter host phenotype by modulating cellular signal transduction pathways, differential mRNA display (Chen et al., 1996) indicated that infection with hypovirus CHV1-EP713 results in a large change in the expression profile. While informative, differential display analyses do not readily reveal which host genes are differentially expressed. Cloning and investigation of the C. parasitica gene termed 13-1 has allowed the construction of a promoter-based reporter system with which to further analyse viral determinants that influence cellular signal transduction pathways (Parsley et al., 2002). 
However, such studies are limited by their reliance on a single-gene readout of pathway activity, which may not reflect broader changes occurring across the whole transcriptome.

As part of our continuing efforts to understand the interactions of hypoviruses and their host, C. parasitica, and the factors that affect fungal virulence, we have undertaken a sequencing project to generate a collection of expressed sequences. At the time of writing, only about 40 C. parasitica genes were represented in the NCBI databases. In the absence of a large-scale sequencing effort, we describe below the identification of approximately 2200 new genes using a moderate-throughput approach. By comparison of the recovered sequences to the publicly available information from NCBI, we demonstrate that these ESTs relate to a wide variety of genes with known roles in other organisms and thus provide for preliminary functional assignment of 1255 genes. Further, we explore a small region of genomic microsynteny that appears conserved across three genera and may have functional relevance to G-protein signalling. We anticipate that this large increase in available sequence data for C. parasitica will prove of considerable utility to the phytopathogenic research community while also enabling future studies that will take advantage of new microarray technologies to examine large-scale transcriptional alterations under a variety of conditions.


   METHODS
Fungal strains and growth conditions.
Cryphonectria parasitica strain EP155 (ATCC 38755) and strain EP155 infected with the prototypic severe hypovirus CHV1-EP713 [designated EP713 (ATCC 52571)] were used as representatives of wild-type and virus-infected strains, respectively. Both strains were maintained on solid medium [3·9 %, w/v, Difco potato dextrose agar (Becton Dickinson)]. Preparative cultures for RNA recovery were grown for 7 days at room temperature under ambient light with cellophane on the agar surface to permit mycelial harvest.

RNA extraction and library construction.
Single-stranded RNA was prepared from solid-medium-grown EP155 and virus-infected cultures according to Chen et al. (1996). Poly(A)+ mRNA was isolated with oligo-d(T) cellulose (Gibco-BRL). Double-stranded cDNA was synthesized and SalI adapters added, then equal amounts of the cDNA from both EP155 and virus-infected EP155 were mixed prior to cloning using the SuperScript Lambda system (Gibco-BRL/Invitrogen) according to the manufacturer's instructions. The in vitro packaging of the resulting phage constructs was accomplished with the Gigapack III procedure (Stratagene) and Escherichia coli strain Y1090 (ZL) for phage propagation and subsequent storage. Finally, the prepared phage were used to infect E. coli DH10B(Zip) (Gibco-BRL/Invitrogen), which permitted plasmid excision resulting in colonies containing cDNA constructs representing both mRNA populations. Individual colonies were transferred into 96-well microtitre plates for cataloguing and storage. Preliminary insert size verification was performed on 20 randomly selected colonies by miniprep and restriction digestion with NotI/SalI. For subsequent experiments not included in this paper, all the clones that provided good sequence data were PCR amplified using T7 and SP6 primers, permitting a larger sampling of insert size data. An additional 120 randomly selected clones were sized in this manner.

cDNA sequence analyses.
Plasmids were prepared from cultures of stored colonies using the Qiaprep 8 system (Qiagen) with a vacuum manifold. Each preparation was checked for yield by electrophoresis, and then submitted to the DNA Sequencing Core Facility at the Center for Biosystems Research for analysis on ABI 3100 or 377 (Applied Biosystems) machines. Single sequence reads in one direction from the 5' end of the cloned fragment were obtained using the M13 reverse primer. Results files were scanned using the SeqMan unit of DNAStar running on a Macintosh G4 computer for the presence of known vector sequence and quality of data so that erroneous results (such as those plasmids with no insert, or multiple templates per reaction) could be removed. Short sequences (less than 100 bp) were also eliminated at this time. Following cleanup, the sequences were assembled into contigs using SeqMan and minimum match percentage of 80 % across at least 50 bp. The resulting contigs combined the similar sequences of duplicate clones or different clones of the same original cDNA product, thus providing an approximation of the number of individual sequences represented in the dataset. Analysis of the total sequence data was then accomplished in two stages, both performed on an IBM PC running the Red Hat Linux operating system and components of the BLAST program suite (Altschul et al., 1997). In the first stage, a library of DNA sequences was prepared from the known C. parasitica items in the NCBI database combined with unpublished sequences from our laboratory. These sequences were then compared, using the BLASTN algorithm, to all of the unknown individual EST sequences (not contigs) so that clones representing known C. parasitica genes could be identified. The second stage used the EST sequences in a BLASTX analysis against the entire non-redundant protein database as available from NCBI. 
Tables listing the clones and their matches, together with NCBI information and alignments for each hit, were generated from the BLAST output using BioPerl (http://www.bioperl.org) software tools. Gene ontology (GO) information was manually added using the Gene Ontology Consortium's Amigo browser (http://www.godatabase.org/cgi-bin/go.cgi). The entire collection of redundant sequences has been submitted to the NCBI (accession numbers for dbEST, 17474474–17478690, corresponding to GenBank numbers CB686454–CB690670) and is also publicly searchable at the COGEME phytopathogen database (cogeme.ex.ac.uk).
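The length filter applied before assembly can be illustrated with a short Python sketch. This is not the SeqMan workflow used in the study; the function name `filter_short_reads`, the dictionary layout and the toy clone names are assumptions introduced here. Only the 100 bp threshold is taken from the text.

```python
MIN_LENGTH_BP = 100  # threshold stated in the text: reads shorter than this are dropped


def filter_short_reads(reads, min_length=MIN_LENGTH_BP):
    """Return only the reads whose trimmed sequence is at least min_length bp."""
    return {name: seq for name, seq in reads.items() if len(seq) >= min_length}


# Toy data: hypothetical clone names and sequences, for illustration only
reads = {
    "clone_A": "ACGT" * 40,  # 160 bp, kept
    "clone_B": "ACGT" * 10,  # 40 bp, discarded
    "clone_C": "A" * 100,    # exactly 100 bp, kept
}

kept = filter_short_reads(reads)
print(sorted(kept))
```

A read of exactly 100 bp is retained here because the text excludes only sequences of less than 100 bp.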

Genomic sequence analyses.
An EcoRI–EcoRI fragment approximately 15 kb in length isolated during characterization of the C. parasitica G-protein β-subunit gene (cpgb-1; Kasahara & Nuss, 1997) was subcloned into pBluescript II SK+ (Stratagene). Following restriction mapping, this insert was divided into smaller sections and subcloned into pBluescript II SK+ or pUC18 to facilitate sequencing, with specific primers being designed to read sequence where required. Sequencing was performed at the CBR core facility and at Keio University, Tokyo, Japan. Assembly of the sequence information with reads in both directions was accomplished using the SeqMan utility of DNAStar. The single contig obtained was determined to accurately represent this region of the C. parasitica genome by comparing the location of the restriction sites predicted with those determined experimentally above. All genomic information for Neurospora crassa and Magnaporthe grisea was obtained from the publicly available databases at the Whitehead Institute/MIT Center for Genome Research (www-genome.wi.mit.edu). Data for Saccharomyces cerevisiae and Schizosaccharomyces pombe were obtained from publicly available information at the Saccharomyces genome resource (genome-www.stanford.edu/Saccharomyces/) and the Sanger Institute (http://www.sanger.ac.uk/Projects/S_pombe/), respectively.


   RESULTS AND DISCUSSION
Sequence analyses
Given the lack of sequence data available for C. parasitica, we embarked upon the creation of a library to permit larger-scale transcriptional profiling in future studies. A library designed to reflect the genes expressed in the presence or absence of hypovirus was prepared by mixing cDNA derived from two sources of mRNA: wild-type C. parasitica strain EP155 and EP155 infected with the prototypic virulence-attenuating hypovirus, CHV1-EP713 (strain EP713). This approach was taken to ensure representation within the library of genes whose expression was greatly reduced in either circumstance. Repeated rounds of infection with the phage library and colony picking were performed to generate between 10 and 20 new microtitre plates on each occasion until 60 plates were completed, representing 5760 cDNA clones. Thus catalogued by plate number and coordinate, plasmid minipreps were performed on each, except for plates 38–44, which generated consistently poor yields and were discarded. Six hundred and thirty-four additional colony losses were accounted for by poor growth, insufficient plasmid yield or handling error. By analysing PCR products or restriction-digested plasmid preparations from a random selection of 140 constructs, 95·7 % of the inserts were found to be larger than 500 bp, with the vast majority being in the 500–2000 bp range (data not shown). All sequence data were generated by single-pass reads of the insert from the 5' end. In one case (53F04) a chimeric insert was found to contain two unrelated cDNA sequences (cryparin and a sequence highly similar to kinesin-related protein KIF1C from N. crassa), but no other cloning anomalies were found. The mean read length after vector removal and quality analysis was 608 bp, often including poly(A) regions at the 3' end. Sequence files that were not included in the following analyses because they were determined to be of insufficient quality, or did not contain an insert of at least 100 bp, totalled 231. 
Following vector and quality trimming, a total of 2 703 587 bp of DNA sequence were included, representing data derived from 4216 clones. Assembly of this sequence information into contigs permitted an estimation of the number of individual sequences included. Totalling the number of contigs present under the particular match parameters as described (see Methods) indicated that 2200 individual gene transcripts were represented but, since the number of contigs obtained in this manner can fluctuate according to the exact matching parameters applied and there is the possibility that two clones may represent different portions of the same coding sequence without overlapping, this number can only be regarded as an approximate value. More than 1600 of these were present only once (unisequences), while the multiply-represented clones followed the distribution shown in Fig. 1(a), with only 28 contigs containing more than 10 clones each. Two of these contigs represented highly abundant mRNA species described below.
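The tally behind such a distribution can be illustrated with a short Python sketch. It is not the SeqMan procedure used in the study; the mapping `clone_to_contig`, the function name and the toy clone and contig labels are hypothetical and serve only to show how unisequences and multiply represented contigs might be counted.

```python
from collections import Counter


def contig_size_distribution(clone_to_contig):
    """Map contig size (clones per contig) to the number of contigs of that size."""
    clones_per_contig = Counter(clone_to_contig.values())
    return Counter(clones_per_contig.values())


# Hypothetical assignment of clones to contigs
clone_to_contig = {
    "c1": "contig_1", "c2": "contig_1", "c3": "contig_1",
    "c4": "contig_2",
    "c5": "contig_3",
    "c6": "contig_4", "c7": "contig_4",
}

distribution = contig_size_distribution(clone_to_contig)
unisequences = distribution[1]  # contigs represented by a single clone
print(dict(sorted(distribution.items())))
print(unisequences)
```

In this toy set, two contigs are unisequences, one contains two clones and one contains three.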



Fig. 1. (a) Distribution of the number of clones per contig following assembly. Omitted for clarity are the contigs corresponding to the two highly abundant sequences (see text) that contain more than 270 clones each. (b) Distribution of the individual hit E-values returned from the BLASTX analysis.

 
Galagan et al. (2003) reported that the genome size of N. crassa was approximately 40 Mb, similar to the ~45 Mb predicted for C. parasitica (Dr Bradley Hillman, Rutgers University, personal communication). Given the close phylogenetic relationship of these two fungi, it is reasonable to postulate that the number of genes in C. parasitica is comparable to the 10 082 currently predicted for N. crassa. Therefore, in terms of the genome, the collection described here probably represents between 20 and 25 % of the total gene coding potential of C. parasitica.
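The arithmetic behind this estimate is shown in the sketch below. It assumes, as the text does, that the C. parasitica gene count is comparable to the N. crassa prediction; the constant and function names are introduced here for illustration only.

```python
ESTIMATED_TRANSCRIPTS = 2200  # approximate number of individual transcripts in this collection
PREDICTED_NC_GENES = 10082    # genes predicted for N. crassa (Galagan et al., 2003)


def fraction_represented(n_transcripts, n_genes):
    """Fraction of the predicted gene complement represented in the collection."""
    return n_transcripts / n_genes


coverage = fraction_represented(ESTIMATED_TRANSCRIPTS, PREDICTED_NC_GENES)
print(f"{coverage:.1%}")  # prints 21.8%
```

Because both inputs are approximations, the result is best read as a range (20 to 25 %) rather than a point value.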

Identification of known C. parasitica sequences
A BLASTN analysis was performed in which our entire EST sequence dataset was compared against a specific collection of DNA sequences consisting of the publicly available C. parasitica information and unpublished data from the Nuss laboratory. In this manner we determined that 598 clones represented 13 known C. parasitica genes as detailed in Table 1, 12 from NCBI submitted data and one from unpublished information. Two cDNA sequences were present in far greater abundance than any others and represented almost 6 % of the total sequences each. These corresponded to ORF B of hypovirus CHV1-EP713 and cryparin, a secreted hydrophobin that has previously been shown to be highly expressed (Zhang et al., 1994). Of the 40 sequences from NCBI that were included in this analysis, 12 (30 %) were contained within our collection, which is broadly consistent with the projection of approximately 25 % coverage above.


Table 1. Known C. parasitica sequences identified in the CEST collection by BLASTN

 
Assignment of putative function by BLASTX
Putative identification of all of the unknown clones was achieved using the BLASTX software tool after removal of the known C. parasitica sequences. The remaining 3618 individual EST sequences were used for this analysis instead of contigs to avoid any problems created by artifacts from the contig assembly, although this approach lengthened processing time considerably. Once completed, the data were sorted using the expect (E) value as a measure of significance of the hit, with E>1×10⁻² classed as insignificant. This yielded 3009 clones with hits and 609 with no significant hit. This list was then trimmed manually to remove multiple representatives of the same hit. Thus, 1734 individual hits were recovered, distributed across a range of E values as described in Fig. 1(b). To further sort the data and provide some relevant functional information, this list was truncated to include only those 1255 examples whose hits returned an E value of 1×10⁻¹⁰ or less. Ontology information was then added manually to create two tables of potential groupings relating to both molecular function and biological process according to the definitions outlined by the Gene Ontology Consortium (www.geneontology.org). Both entire lists are available as electronic supplementary data with the online version of this article (mic.sgmjournals.org). It should be noted, however, that this provides only an indication of functional activity or relevance to a particular process, but does not reflect fully the non-redundant and dynamic nature of the GO database terms. These GO terms were developed such that one protein may well fit into several categories (Ashburner et al., 2000), thus necessitating a subjective assessment of the most appropriate single group for assignment in our studies. 
To examine the groups of sequences isolated, the complete dataset has been used to generate pie charts that illustrate the distribution of the hits obtained across the various molecular function and biological process (Fig. 2) categories. In both cases, a significant number (almost 30 %) of the hits related strongly to known genes of unknown function. In the case of molecular function, enzymes constituted a further 40·9 %, of which the hydrolases (27·6 %) and oxidoreductases (28·2 %) were most prevalent (Fig. 2a). Of the structural proteins identified, the majority (almost 80 %) were ribosomal subunits, with cytoskeleton and cell wall structural components making up a further 19 %. In terms of the biological processes with which these proteins are associated, more than two-thirds fell into the category of cell growth and maintenance, with far fewer grouped in cell communication (3·1 %) or development (0·8 %; Fig. 2b). Three-quarters of those sequences classified in the cell growth and maintenance cluster were involved in metabolic processes. From the combined observations in Fig. 2, we conclude that the C. parasitica sequences identified are broadly representative of the entire population of expressed genes in this organism.
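The two E-value cut-offs used to sort the BLASTX results can be expressed as a small Python sketch. The thresholds are those given in the text; everything else is an assumption for illustration. In particular, the trimming of duplicate hits was performed manually in the study, so the rule used here (keep the clone with the lowest E value for each database subject) and all names and toy values are hypothetical.

```python
INSIGNIFICANT_ABOVE = 1e-2   # hits with E > 1e-2 are classed as insignificant
FUNCTIONAL_AT_MOST = 1e-10   # hits with E <= 1e-10 are retained for functional grouping


def classify_hit(e_value):
    """Assign a hit to one of three classes based on its E value."""
    if e_value is None or e_value > INSIGNIFICANT_ABOVE:
        return "no significant hit"
    if e_value <= FUNCTIONAL_AT_MOST:
        return "functional assignment"
    return "significant"


def best_hit_per_subject(hits):
    """Collapse clones matching the same subject, keeping the lowest E value.

    Assumed rule: the study trimmed duplicates by hand and does not state
    which representative was retained.
    """
    best = {}
    for clone, subject, e_value in hits:
        if classify_hit(e_value) == "no significant hit":
            continue
        if subject not in best or e_value < best[subject][1]:
            best[subject] = (clone, e_value)
    return best


# Toy BLASTX results: (clone, subject, E value), all hypothetical
hits = [
    ("clone_1", "protein_X", 1e-50),
    ("clone_2", "protein_X", 1e-20),
    ("clone_3", "protein_Y", 1e-5),
    ("clone_4", "protein_Z", 0.5),
]

best = best_hit_per_subject(hits)
functional = [s for s, (_, e) in best.items() if classify_hit(e) == "functional assignment"]
print(best)
print(functional)
```

Here protein_X is represented once (by clone_1) and qualifies for functional grouping, protein_Y is significant but falls outside the stricter cut-off, and protein_Z is discarded.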



Fig. 2. Pie chart representations of the significant hits returned by BLASTX analysis classified according to (a) Molecular Function, expanded to show further division of the Structural and Enzyme categories; and (b) Biological Process, expanded to show divisions within the Cell Growth and Maintenance category.

 
Identification of genes relevant to growth, development and pathogenicity of C. parasitica
Specific details of the following clones that have returned hits of potential interest are noted in Table 2. This table does not represent an exhaustive list of proteins from any functional category; for that information the reader is referred to the electronic supplementary data (mic.sgmjournals.org).


Table 2. Details of the putative protein identifications based on BLASTX analysis described in the text

 
The interaction between a pathogen and its host is dynamic, requiring the pathogen to continually adjust to changes in the microenvironment and to resistance factors produced by the host. Stress response proteins may fulfil roles to maintain homeostasis under these constantly variable conditions, thus contributing to virulence. This may be manifested in the adjustment of polyol accumulation to cope with hyperosmotic stress as is the case for Cladosporium fulvum (Clark et al., 2003) and M. grisea (Dixon et al., 1999) or production of superoxide dismutases by the human pathogen Candida albicans to protect against oxidative stress (Hwang et al., 2002). Stress-response proteins account for 1 % of the hits classified as cell growth and maintenance (Fig. 2) and include a halotolerance protein from Sch. pombe involved in inositol metabolism, as well as superoxide dismutases from Claviceps purpurea, Glomerella cingulata, both plant pathogens, and A. nidulans. Furthermore, we noted the presence of genes corresponding to a general stress-response protein STI35 identified in several species of Fusarium (Choi et al., 1990), and a protein termed GRG-1 from Podospora anserina that is specifically enhanced under conditions of carbon starvation (Kimpel & Osiewacz, 1999). These sequences may represent genes involved in mechanisms of resistance and/or response to host factors that influence phytopathogenicity of C. parasitica.

Given the dynamic host–pathogen interface, the importance of pathways that transmit signals from the cell surface of the pathogen to the nucleus cannot be overstated. Our laboratory has previously identified a number of signal transduction components, including the G-proteins CPG-1, CPG-2 (Choi et al., 1995), CPG-3 (Parsley et al., 2003) and CPGB-1 (Kasahara & Nuss, 1997) as well as other proteins required for their function such as BDM-1 (Kasahara et al., 2000) and a regulator of Gα, RGS (G. C. Segers & D. L. Nuss, unpublished). Identifying specific sequences of interest in our collection, we have also noted a number of other genes that appear to be involved in perception and transfer of signals. These include additional internal transducers: examples of the small GTPase RHO family, a RAB family member, Drab11, and a RAN-binding protein (Sbp1p) probably involved in cell cycle control. Other genes of predicted importance for growth include DFG5, a glycosylphosphatidylinositol (GPI)-anchored protein essential for cell wall biogenesis in S. cerevisiae (Kitagaki et al., 2002) and two septins. Similarly, cellular structural components were found (making up 11·6 % of the structural protein category, Fig. 2a), including tubulin and actin as well as components that regulate their organization and biogenesis such as Dam1p, Cct7p and the Arp2/3 complex. Chen et al. (1996) noted the alteration of cAMP levels in response to hypovirus infection; therefore the identification of a RAS-interacting protein (RIP3) that, in Dictyostelium discoideum, is required for chemotaxis and cAMP-mediated signal transduction (Lee et al., 1999) is potentially relevant. Most interestingly, we have also identified an orthologue of the Ras guanine nucleotide exchange factor (RasGEF) AleA, a likely functional partner of RIP3. Together, these two components may form part of a C. parasitica RAS-regulated pathway as has been postulated for D. discoideum (Lee et al., 1999).

During growth of C. parasitica on solid medium, rings of developmentally distinct regions become apparent, areas that are primarily the sites of asexual sporulation and pigmentation. The frequency of the occurrence of these bands corresponds to the light cycle in which the culture is being grown and thus provides evidence of the light-responsive nature of these processes. In N. crassa, substantial advances have been made in studying the mechanism of light response and maintenance of circadian rhythms upon which similar developmental steps including conidiation are dependent (reviewed by Dunlap, 1996). One key component of the process is the PAS protein VVD (Heintzen et al., 2001). We have identified an orthologue of this gene as well as a clock-controlled gene whose expression in N. crassa is dependent on the clock cycle (Bell-Pedersen et al., 1996) and a cDNA similar to the nop1 gene of N. crassa that encodes a putative rhodopsin orthologue (Bieszke et al., 1999). In an additional effort unrelated to the C. parasitica EST library, we have also recovered a partial sequence of the transcriptional regulator FRQ from C. parasitica by degenerate PCR (A. L. Dawe & D. L. Nuss, unpublished observations). Further study of the mechanisms of light response and circadian rhythm maintenance may provide insights into the importance of these developmental regulatory pathways for fungal plant pathogenicity.

Our studies are also predicated in part upon efforts to enhance the naturally occurring virulence attenuation characteristics of the virus family Hypoviridae as a method of biological control (reviewed by Dawe & Nuss, 2001). Effective dissemination of the hypovirus through a native population requires the passing of the dsRNA elements from infected to non-infected fungus by means of anastomosis (hyphal fusion: Anagnostakis & Day, 1979). This process is dependent upon the vegetative incompatibility (vic) loci of each strain, and virus transmission can be greatly inhibited by heteroallelism between pairs (Cortesi et al., 2001). At the time of writing, public information concerning the sequence of the vic genes was unavailable, but we have isolated five new C. parasitica sequences in this study that appear to be related to this important process. Genetic analysis of the fungus P. anserina has identified at least 10 loci (het-c, -d and -e; mod-a, -d and -e; idi-1, -2 and -3; pspA), as being involved in the vegetative incompatibility reaction (reviewed by Glass & Kaneko, 2003) and their characterization has revealed proteins probably involved in signal transduction and membrane structural modifications. The HET-C protein has recently been shown to increase glycosphingolipid transfer rates (Mattjus et al., 2003) and also is involved in growth and sporulation (Saupe et al., 1994). Our analysis yielded three EST clones corresponding to het-c. Complete sequencing of these inserts from both 5' and 3' ends has enabled us to construct a complete predicted amino acid sequence of the C. parasitica HET-C orthologue. When compared to the sequence for het-c of P. anserina and the predicted protein of N. crassa locus NCU07947.1, significant identity throughout the polypeptides is clear (Fig. 3). Each protein is 53–63 % identical to the other two, suggesting considerable conservation of incompatibility components between these fungi. This is further supported by the presence of C. 
parasitica ESTs corresponding to the vegetative incompatibility proteins HET-D, a WD-40 domain protein of unknown function from P. anserina (Espagne et al., 2002) and HET-6 of N. crassa, whose function is also presently unspecified. The mod genes of P. anserina are believed to code for signalling components that mediate the incompatibility response that can lead to programmed cell death. MOD-A and MOD-E were represented in the C. parasitica library and a sixth C. parasitica gene, the Gα subunit CPG-2 identified by Choi et al. (1995), may also be relevant since it is closely related to the MOD-D protein (Loubradou et al., 1999). In contrast to the CPG-1 α subunit, which caused extensive phenotypic changes when absent, deletion of cpg-2 caused only minor changes from wild-type and therefore was not required for pigmentation, sporulation or virulence (Choi et al., 1995). Further analysis of the clones identified in this study may, in conjunction with characterized loci in C. parasitica and other organisms, provide additional insights into this phenomenon and its relevance to the engineering of hypoviruses for enhanced potential as biological control agents.
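Pairwise identity values such as those quoted for the HET-C orthologues are derived from aligned sequences. The Python sketch below shows one common way to compute such a figure; it is not the PILEUP procedure used in the study. The choice of denominator (aligned positions with no gap in either sequence) is an assumption, as are the function name and the toy peptide strings.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over ungapped aligned positions.

    Assumption: positions with a gap in either sequence are excluded
    from the denominator; other conventions give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned peptides (hypothetical, not real HET-C sequence)
seq_1 = "MKT-LLV"
seq_2 = "MRTALLI"
identity = percent_identity(seq_1, seq_2)
print(round(identity, 1))
```

In this toy pair, four of the six ungapped positions match.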



Fig. 3. PILEUP comparison of HET-C protein sequences from P. anserina (P.a. HET-C; gi number 537934), N. crassa (NCU07947.1) and C. parasitica (C.p. HET-C; derived from three completely sequenced cDNA inserts). Only those amino acids conserved between all three proteins are highlighted, with dark blocks representing identical residues and grey blocks similar residues.

 
A representation of the genera of the organisms from which the best BLASTX results for the EST clones were derived is given in Fig. 4. To avoid skewing of the data due to the presence of large numbers of hits from highly analysed genomes, we have scored each genus only once. Thus, the pie charts depict a breakdown of the 132 individual genera represented in the hits analysed above. As expected, most of the highly homologous hits are obtained with sequences from ascomycetous fungi. Narrowing the list to follow the taxonomy of C. parasitica, it is evident that most of the hits are from phylogenetically related organisms. The final grouping represents the Sordariomycetes and contains many phytopathogenic fungi including the genera Claviceps, Glomerella, Gibberella, Nectria, Ophiostoma, Rosellinia and Magnaporthe. N. crassa, while not a pathogen, is also a member of this group. Of the 33 genera from the Pezizomycotina group of ascomycetes that yielded BLASTX hits of E<1×10⁻¹⁰, 16 are plant pathogens and a further six are pathogens of other organisms. Continued study of the sequence data presented here may well illuminate specific pathways and genes that are common between pathogenic fungi and required for virulence.
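Scoring each genus only once amounts to reducing the list of hit organisms to a set of genus names. A minimal Python sketch follows, assuming each hit is labelled with a binomial species name whose first word is the genus; the function name and the toy list are hypothetical.

```python
def genera_represented(hit_organisms):
    """Return the set of genera among the hit organisms, each counted once.

    Assumption: every entry is a binomial name such as 'Neurospora crassa'.
    """
    return {name.split()[0] for name in hit_organisms if name.strip()}


# Toy list of best-hit organisms (illustrative only)
hit_organisms = [
    "Neurospora crassa",
    "Neurospora crassa",
    "Neurospora intermedia",
    "Magnaporthe grisea",
    "Podospora anserina",
]

genera = genera_represented(hit_organisms)
print(sorted(genera))
print(len(genera))
```

The three Neurospora hits collapse to a single genus, so a heavily sequenced organism carries no more weight than any other.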



Fig. 4. A pie chart representation of the distribution of significant hits returned by BLASTX analysis, classified according to the taxonomy of the organism from which the hit was derived. Each different genus was only scored once to avoid bias from large sequencing projects.

 
Genomic microsynteny among Sordariomycetes
During investigations in our laboratory of signal transduction pathways that modulate virulence in C. parasitica, we have used the publicly available information to compare C. parasitica genes with those present in the N. crassa genome. Our previous studies had identified CPGB-1, a G-protein β-subunit (Kasahara & Nuss, 1997), and a novel phosducin-like protein called BDM-1 (Kasahara et al., 2000). While the exact roles of these two proteins have not been determined, it is clear from studies in C. parasitica that BDM-1 is essential for correct CPGB-1-mediated signalling (Kasahara et al., 2000), most likely through a direct interaction of the two proteins (A. L. Dawe & D. L. Nuss, unpublished observations), and that the pathway(s) affected subsequently modulate a wide variety of cellular processes including growth, pigmentation, sporulation and virulence. Similar developmental importance has been ascribed to the gnb-1 gene of N. crassa (Yang et al., 2002). Examination of the N. crassa genomic data revealed a gene, locus NCU00441.1, that appeared similar to bdm-1. Furthermore, this locus was situated in close proximity to gnb-1 (spanning a region of approximately 7 kb) and the ORFs were oriented in the same direction (Fig. 5a). Returning to the original λ phage clone from which the genomic sequence of cpgb-1 was obtained, we were able to establish that part of the ORF of bdm-1 was present at the opposite end to cpgb-1 (data not shown). Subsequent shotgun sequencing of this entire λ clone was then correlated with existing genomic data of the two genes and their immediate surroundings. A single contig of 12982 bp comprising 51 sequences (23 for the top strand, 28 for the bottom, resulting in 2·5-fold average coverage over the entire length) was found to contain the two genes oriented in the same manner as N. crassa (Fig. 5b).
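As a rough consistency check on the contig statistics quoted above, average coverage can be taken as total read bases divided by contig length. Individual read lengths are not reported in the text, so the mean read length computed below is a back-calculation from the stated figures, not a measured value.

```python
# Figures quoted in the text.
CONTIG_LENGTH_BP = 12982
TOP_STRAND_READS = 23
BOTTOM_STRAND_READS = 28
REPORTED_COVERAGE = 2.5

N_READS = TOP_STRAND_READS + BOTTOM_STRAND_READS  # 51 sequences in total


def mean_coverage(total_read_bases, contig_length):
    """Average fold coverage = total sequenced bases / contig length."""
    return total_read_bases / contig_length


# Back-calculated (assumed, not reported): the mean read length that would
# give the stated coverage with this number of reads.
implied_mean_read_length = REPORTED_COVERAGE * CONTIG_LENGTH_BP / N_READS

print(N_READS)
print(round(implied_mean_read_length, 1))
```

The implied mean of roughly 636 bp per read is in the range typical of single-pass sequencing reads, so the three quoted numbers are mutually consistent.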



Fig. 5. A region of microsynteny between three Sordariomycetes: conserved genomic arrangement of two genes that encode a G-protein β-subunit and a phosducin-like potential regulator of Gβ signalling in (a) N. crassa, (b) C. parasitica and (c) M. grisea.

 
Regions of microsynteny have been previously noted between M. grisea and N. crassa, wherein orientations of ORFs and relative locations were maintained (Hamer et al., 2001). Therefore, further comparison was also made with available data for M. grisea, although the genomic information from release 2.1 did not represent a complete genome assembly (Whitehead Institute/MIT Center for Genome Research, January 2003). However, the orthologues of the Gβ subunit (locus MG05201.1; also known as Mgb1, NCBI gi number 21624377) and bdm-1 (locus MG05200.1) were found to be located on adjacent (linked) cosmids (Fig. 5c). The distance between the two genes cannot be accurately determined until the intervening sequence is released, but the relative orientation of the two reading frames appeared to be identical to that of C. parasitica and N. crassa. Using the COGEME phytopathogen EST database (cogeme.ex.ac.uk/) we have also noted the presence of EST sequences corresponding to orthologues of Gβ (clone Bc-con2700) and BDM-1 (clone Bc-con1300) from the plant pathogen Botryotinia fuckeliana. Presently, appropriate genome information is not available but it would be our prediction that the arrangement of the BDM-1 and Gβ genes is maintained in this organism also. While published data on the bdm-1 orthologues are not available for either M. grisea or N. crassa, it seems likely that the protein products of these genes will prove important to the correct function of the Gβ subunit given that this consistent arrangement of the two genes is maintained so rigorously.

The synteny appears absent in the genomes of S. cerevisiae and Sch. pombe, since the two genes of interest are at least 100 kb and 40 kb apart, respectively (data not shown). Furthermore, of the two phosducin-like proteins Plp1p (gi number 6320389; most similar to BDM-1) and Plp2p (gi number 6324856) in S. cerevisiae (Flanary et al., 2000), PLP2 is located closer to the β-subunit STE4. We have also noted an EST clone (26C10, tentatively termed bdm-2) from C. parasitica that appears to correspond to PLP2 (E=9×10⁻³⁴), a gene that gave a lethal phenotype when knocked out in S. cerevisiae (Flanary et al., 2000). Examining the genome data available for N. crassa and M. grisea we have determined that sequences that resemble PLP2/bdm-2 are present in both of these fungi also (NCU00617.1 and MG02871.1, respectively) but are located on different linkage groups and supercontigs. Therefore, while both Plp1p and Plp2p were characterized as phosducin-like proteins from S. cerevisiae, it appears as if the functional relationship of the Plp2p family to Gβ subunits may be quite different from that which has been hypothesized for BDM-1 (Kasahara et al., 2000) and Plp1p (Flanary et al., 2000).

Of additional note, several N. crassa EST clones were found to map to a region approximately 700 bp downstream of the Gβ subunit, but comparison with the entire cDNA fragment of gnb-1 (Yang et al., 2002) indicated that they represented an untranslated region of this ORF (data not shown). There are no other ESTs or predicted ORFs that map to this region of the N. crassa genome. Similarly, we have not isolated any C. parasitica ESTs that correspond to the equivalent region in this organism. Therefore, it appears as if the two genes are maintained in this arrangement across genera without any intervening genes. Additional comparison of the intervening regions of N. crassa and C. parasitica failed to indicate any conserved regions of sequence identity or similarity (data not shown). This suggests that the physical proximity of the two genes is the most important feature, perhaps reflecting a requirement for co-segregation of these loci.
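The criteria applied in this section (same contig, same relative orientation, close proximity and no intervening genes) can be expressed as a simple test. The sketch below is illustrative only: the `Gene` record, the `max_gap_bp` threshold and all coordinates are hypothetical and are not taken from the N. crassa, C. parasitica or M. grisea data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    contig: str  # chromosome, supercontig or clone identifier
    start: int   # start coordinate, with start < end
    end: int
    strand: str  # "+" or "-"


def intervening_genes(a, b, annotations):
    """Annotated genes lying wholly between a and b on their shared contig."""
    left, right = sorted((a, b), key=lambda g: g.start)
    return [
        g for g in annotations
        if g.contig == left.contig
        and g not in (a, b)
        and g.start > left.end
        and g.end < right.start
    ]


def is_conserved_pair(a, b, annotations, max_gap_bp):
    """True if a and b are linked, co-oriented, close and uninterrupted."""
    if a.contig != b.contig:
        return False
    left, right = sorted((a, b), key=lambda g: g.start)
    gap = right.start - left.end
    return (
        a.strand == b.strand
        and 0 <= gap <= max_gap_bp
        and not intervening_genes(a, b, annotations)
    )


# Hypothetical coordinates: one tightly linked pair and one distant pair.
gbeta = Gene("G-beta", "contig_A", 1000, 2500, "+")
phosducin_like = Gene("phosducin-like", "contig_A", 5200, 6400, "+")
distant = Gene("phosducin-like", "contig_A", 110000, 111200, "+")

close_ok = is_conserved_pair(gbeta, phosducin_like, [gbeta, phosducin_like], max_gap_bp=10_000)
far_ok = is_conserved_pair(gbeta, distant, [gbeta, distant], max_gap_bp=10_000)
print(close_ok, far_ok)
```

A threshold-based test of this kind only captures the arrangement itself; it says nothing about sequence conservation in the intervening region, which the comparison above found to be absent.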

Conclusions
We have catalogued the largest collection of expressed genes thus far produced for C. parasitica, resulting in approximately 1·3 million bp of previously unknown sequence. We have demonstrated that the sequences represented a wide variety of molecular functions involved in many biological processes. The acquired data pointed to considerable similarity of expressed genes between related species and suggested that this may extend to conserved genomic arrangements. With the available database created in this study, used in conjunction with microarray technology, we now have the opportunity to explore on a large scale the transcriptional changes that occur during events such as hypovirus infection. This will undoubtedly lead to enhanced understanding of hypovirus-mediated modulation of host gene expression as well as provide opportunities for future studies of genes that are likely to be critical for fungal plant pathogenesis.


   ACKNOWLEDGEMENTS
 
The authors would like to thank Steve Screen for assistance with setting up the BLAST utilities, Todd Allen for critical evaluation of the manuscript, and Ellen Rosenbloom and Sharmili Mathur for technical assistance. This work was supported in part by Public Health Service grant number GM55981 to D. L. N.


   REFERENCES
 
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389–3402.

Anagnostakis, S. L. & Day, P. R. (1979). Hypovirulence conversion in Endothia parasitica. Phytopathology 69, 1226–1229.

Ashburner, M., Ball, C. A., Blake, J. A. & 17 other authors (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25, 25–29.

Balasubramanian, M. K., Feoktistova, A., McCollum, D. & Gould, K. L. (1996). Fission yeast Sop2p: a novel and evolutionarily conserved protein that interacts with Arp3p and modulates profilin function. EMBO J 15, 6426–6437.

Barreau, C., Iskandar, M., Loubradou, G., Levallois, V. & Begueret, J. (1998). The mod-A suppressor of nonallelic heterokaryon incompatibility in Podospora anserina encodes a proline-rich polypeptide involved in female organ formation. Genetics 149, 915–926.

Beauvais, A., Bruneau, J. M., Mol, P. C., Buitrago, M. J., Legrand, R. & Latge, J. P. (2001). Glucan synthase complex of Aspergillus fumigatus. J Bacteriol 183, 2273–2279.

Bell-Pedersen, D., Shinohara, M. L., Loros, J. J. & Dunlap, J. C. (1996). Circadian clock-controlled genes isolated from Neurospora crassa are late night- to early morning-specific. Proc Natl Acad Sci U S A 93, 13096–13101.

Bennett, J. W. & Arnold, J. (2001). Genomics for fungi. In The Mycota VIII: Biology of the Fungal Cell, pp. 267–297. Edited by R. J. Howard & N. A. R. Gow. Berlin: Springer-Verlag.

Bieszke, J. A., Braun, E. L., Bean, L. E., Kang, S., Natvig, D. O. & Borkovich, K. A. (1999). The nop-1 gene of Neurospora crassa encodes a seven transmembrane helix retinal-binding protein homologous to archaeal rhodopsins. Proc Natl Acad Sci U S A 96, 8034–8039.

Chen, B., Gao, S., Choi, G. H. & Nuss, D. L. (1996). Extensive alteration of fungal gene transcript accumulation and elevation of G-protein-regulated cAMP levels by a virulence-attenuating hypovirus. Proc Natl Acad Sci U S A 93, 7996–8000.

Choi, G. H. & Nuss, D. L. (1990). Nucleotide sequence of the glyceraldehyde-3-phosphate dehydrogenase gene from Cryphonectria parasitica. Nucleic Acids Res 18, 5566.

Choi, G. H., Marek, E. T., Schardl, C. L., Richey, M. G., Chang, S. Y. & Smith, D. A. (1990). sti35, a stress-responsive gene in Fusarium spp. J Bacteriol 172, 4522–4528.

Choi, G. H., Larson, T. G. & Nuss, D. L. (1992). Molecular analysis of the laccase gene from the chestnut blight fungus and selective suppression of its expression in an isogenic hypovirulent strain. Mol Plant–Microbe Interact 5, 119–128.

Choi, G. H., Chen, B. & Nuss, D. L. (1995). Virus-mediated or transgenic suppression of a G-protein alpha subunit and attenuation of fungal virulence. Proc Natl Acad Sci U S A 92, 305–309.

Clark, A. J., Blissett, K. J. & Oliver, R. P. (2003). Investigating the role of polyols in Cladosporium fulvum during growth under hyper-osmotic stress and in planta. Planta 216, 614–619.

Cortesi, P., McCulloch, C. E., Song, H., Lin, H. & Milgroom, M. G. (2001). Genetic control of horizontal virus transmission in the chestnut blight fungus, Cryphonectria parasitica. Genetics 159, 107–118.

Dawe, A. L. & Nuss, D. L. (2001). Hypoviruses and chestnut blight: exploiting viruses to understand and modulate fungal pathogenesis. Annu Rev Genet 35, 1–29.

Diez, B., Velasco, J., Marcos, A. T., Rodriguez, M., de la Fuente, J. L. & Barredo, J. L. (2000). The gene encoding gamma-actin from the cephalosporin producer Acremonium chrysogenum. Appl Microbiol Biotechnol 54, 786–791.

Dixon, K. P., Xu, J. R., Smirnoff, N. & Talbot, N. J. (1999). Independent signaling pathways regulate cellular turgor during hyperosmotic stress and appressorium-mediated plant infection by Magnaporthe grisea. Plant Cell 11, 2045–2058.

Dunlap, J. C. (1996). Genetics and molecular analysis of circadian rhythms. Annu Rev Genet 30, 579–601.

Espagne, E., Balhadere, P., Penin, M. L., Barreau, C. & Turcq, B. (2002). HET-E and HET-D belong to a new subfamily of WD40 proteins involved in vegetative incompatibility specificity in the fungus Podospora anserina. Genetics 161, 71–81.

Flanary, P. L., DiBello, P. R., Estrada, P. & Dohlman, H. G. (2000). Functional analysis of Plp1 and Plp2, two homologues of phosducin in yeast. J Biol Chem 275, 18462–18469.

Galagan, J. E., Calvo, S. E., Borkovich, K. A. & 74 other authors (2003). The genome sequence of the filamentous fungus Neurospora crassa. Nature 422, 859–868.

Glass, N. L. & Kaneko, I. (2003). Fatal attraction: nonself recognition and heterokaryon incompatibility in filamentous fungi. Eukaryot Cell 2, 1–8.

Hamer, L., Pan, H., Adachi, K., Orbach, M. J., Page, A., Ramamurthy, L. & Woessner, J. P. (2001). Regions of microsynteny in Magnaporthe grisea and Neurospora crassa. Fungal Genet Biol 33, 137–143.

He, X., Hayashi, N., Walcott, N. G., Azuma, Y., Patterson, T. E., Bischoff, F. R., Nishimoto, T. & Sazer, S. (1998). The identification of cDNAs that affect the mitosis-to-interphase transition in Schizosaccharomyces pombe, including sbp1, which encodes a spi1p-GTP-binding protein. Genetics 148, 645–656.

Heckmann, S., Schliwa, M. & Kube-Granderath, E. (1997). Primary structure of Neurospora crassa gamma-tubulin. Gene 199, 303–309.

Heintzen, C., Loros, J. J. & Dunlap, J. C. (2001). The PAS protein VIVID defines a clock-associated feedback loop that represses light input, modulates gating, and regulates clock resetting. Cell 104, 453–464.

Hwang, C. S., Rhie, G. E., Oh, J. H., Huh, W. K., Yim, H. S. & Kang, S. O. (2002). Copper- and zinc-containing superoxide dismutase (Cu/ZnSOD) is required for the protection of Candida albicans against oxidative stresses and the expression of its full virulence. Microbiology 148, 3705–3713.

Jara, P., Gilbert, S., Delmas, P., Guillemot, J. C., Kaghad, M., Ferrara, P. & Loison, G. (1996). Cloning and characterization of the eapB and eapC genes of Cryphonectria parasitica encoding two new acid proteinases, and disruption of eapC. Mol Gen Genet 250, 97–105.

Kamoun, S., Hraber, P., Sobral, B., Nuss, D. & Govers, F. (1999). Initial assessment of gene diversity for the oomycete pathogen Phytophthora infestans based on expressed sequences. Fungal Genet Biol 28, 94–106.

Kasahara, S. & Nuss, D. L. (1997). Targeted disruption of a fungal G-protein beta subunit gene results in increased vegetative growth but reduced virulence. Mol Plant–Microbe Interact 10, 984–993.

Kasahara, S., Wang, P. & Nuss, D. L. (2000). Identification of bdm-1, a gene involved in G protein beta-subunit function and alpha-subunit accumulation. Proc Natl Acad Sci U S A 97, 412–417.

Keon, J., Bailey, A. & Hargreaves, J. (2000). A group of expressed cDNA sequences from the wheat fungal leaf blotch pathogen, Mycosphaerella graminicola (Septoria tritici). Fungal Genet Biol 29, 118–133.

Kim, S., Ahn, I. P. & Lee, Y. H. (2001). Analysis of genes expressed during rice-Magnaporthe grisea interactions. Mol Plant–Microbe Interact 14, 1340–1346.

Kimpel, E. & Osiewacz, H. D. (1999). PaGrg1, a glucose-repressible gene of Podospora anserina that is differentially expressed during lifespan. Curr Genet 35, 557–563.

Kitagaki, H., Wu, H., Shimoi, H. & Ito, K. (2002). Two homologous genes, DCW1 (YKL046c) and DFG5, are essential for cell growth and encode glycosylphosphatidylinositol (GPI)-anchored membrane proteins required for cell wall biogenesis in Saccharomyces cerevisiae. Mol Microbiol 46, 1011–1022.

Lee, S., Parent, C. A., Insall, R. & Firtel, R. A. (1999). A novel Ras-interacting protein required for chemotaxis and cyclic adenosine monophosphate signal relay in Dictyostelium. Mol Biol Cell 10, 2829–2845.

Loubradou, G., Begueret, J. & Turcq, B. (1997). A mutation in an HSP90 gene affects the sexual cycle and suppresses vegetative incompatibility in the fungus Podospora anserina. Genetics 147, 581–588.

Loubradou, G., Begueret, J. & Turcq, B. (1999). MOD-D, a G-alpha subunit of the fungus Podospora anserina, is involved in both regulation of development and vegetative incompatibility. Genetics 152, 519–528.

Mattjus, P., Turcq, B., Pike, H. M., Molotkovsky, J. G. & Brown, R. E. (2003). Glycolipid intermembrane transfer is accelerated by HET-C2, a filamentous fungus gene product involved in the cell-cell incompatibility response. Biochemistry 42, 535–542.

Merkel, H. W. (1906). A deadly fungus on the American chestnut. N Y Zool Soc Annu Rep 10, 97–103.

Momany, M. & Hamer, J. E. (1997). The Aspergillus nidulans septin encoding gene, aspB, is essential for growth. Fungal Genet Biol 21, 92–100.

Nakano, K. & Mabuchi, I. (1995). Isolation and sequencing of two cDNA clones encoding Rho proteins from the fission yeast Schizosaccharomyces pombe. Gene 155, 119–122.

Parsley, T. B., Chen, B., Geletka, L. M. & Nuss, D. L. (2002). Differential modulation of cellular signaling pathways by mild and severe hypovirus strains. Eukaryot Cell 1, 401–413.

Parsley, T. B., Segers, G., Nuss, D. L. & Dawe, A. L. (2003). Analysis of altered G-protein subunit accumulation in Cryphonectria parasitica reveals a third G-α homologue. Curr Genet 43, 24–33.

Rasmussen, S. W. (1995). A 37·5 kb region of yeast chromosome X includes the SME1, MEF2, GSH1 and CSD3 genes, a TCP-1-related gene, an open reading frame similar to the DAL80 gene, and a tRNA(Arg). Yeast 11, 873–883.

Razanamparany, V., Jara, P., Legoux, R., Delmas, P., Msayeh, F., Kaghad, M. & Loison, G. (1992). Cloning and mutation of the gene encoding endothiapepsin from Cryphonectria parasitica. Curr Genet 21, 455–461.

Saupe, S., Descamps, C., Turcq, B. & Begueret, J. (1994). Inactivation of the Podospora anserina vegetative incompatibility locus het-c, whose product resembles a glycolipid transfer protein, drastically impairs ascospore production. Proc Natl Acad Sci U S A 91, 5927–5931.

Shapira, R., Choi, G. H. & Nuss, D. L. (1991). Virus-like genetic organization and expression strategy for a double-stranded RNA genetic element associated with biological control of chestnut blight. EMBO J 10, 731–739.

Singh, G., Sinha, H. & Ashby, A. M. (2000). Cloning and expression studies during vegetative and sexual development of Pbs1, a septin gene homologue from Pyrenopeziza brassicae. Biochim Biophys Acta 1497, 168–174.

Skinner, W., Keon, J. & Hargreaves, J. (2001). Gene information for fungal plant pathogens from expressed sequences. Curr Opin Microbiol 4, 381–386.

Smith, M. L., Micali, O. C., Hubbard, S. P., Mir-Rashed, N., Jacobson, D. J. & Glass, N. L. (2000). Vegetative incompatibility in the het-6 region of Neurospora crassa is mediated by two linked genes. Genetics 155, 1095–1104.

Soanes, D. M., Skinner, W., Keon, J., Hargreaves, J. & Talbot, N. J. (2002). Genomics of phytopathogenic fungi and the development of bioinformatic resources. Mol Plant–Microbe Interact 15, 421–427.

Takano, Y., Oshiro, E. & Okuno, T. (2001). Microtubule dynamics during infection-related morphogenesis of Colletotrichum lagenarium. Fungal Genet Biol 34, 107–121.

Thomas, S. W., Rasmussen, S. W., Glaring, M. A., Rouster, J. A., Christiansen, S. K. & Oliver, R. P. (2001). Gene identification in the obligate fungal pathogen Blumeria graminis by expressed sequence tag analysis. Fungal Genet Biol 33, 195–211.

Wang, P., Larson, T. G., Chen, C. H., Pawlyk, D. M., Clark, J. A. & Nuss, D. L. (1998). Cloning and characterization of a general amino acid control transcriptional activator from the chestnut blight fungus Cryphonectria parasitica. Fungal Genet Biol 23, 81–94.

Wood, V., Gwilliam, R., Rajandream, M. A. & other authors (2002). The genome sequence of Schizosaccharomyces pombe. Nature 415, 871–880.

Yang, Q., Poole, S. I. & Borkovich, K. A. (2002). A G-protein beta subunit required for sexual and vegetative development and maintenance of normal Gα protein levels in Neurospora crassa. Eukaryot Cell 1, 378–390.

Zhang, L., Villalon, D., Sun, Y., Kazmierczak, P. & Van Alfen, N. K. (1994). Virus-associated down-regulation of the gene encoding cryparin, an abundant cell-surface protein from the chestnut blight fungus, Cryphonectria parasitica. Gene 139, 59–64.

Zhang, N. & Blackwell, M. (2001). Molecular phylogeny of dogwood anthracnose fungus (Discula destructiva) and the Diaporthales. Mycologia 93, 355–365.

Received 28 March 2003; revised 12 June 2003; accepted 13 June 2003.