* Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, Ohio; Plant-Pathogen Interaction Programme, Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, United Kingdom
Correspondence: E-mail: kamoun.1{at}osu.edu.
Abstract
Key Words: diversifying selection, Phytophthora infestans, virulence, avirulence, cysteine-rich, host-microbe interactions
Introduction
The neutral theory of molecular evolution maintains that most molecular polymorphisms within a species and most molecular divergence between species are driven by random fixation of selectively neutral mutations (Kimura 1983). By investigating the prevalence of nucleotide polymorphism and divergence, it is possible to obtain considerable insight into the evolutionary processes that shaped a particular genomic region (Hudson 1993). In the last decade, the genetic architecture of polymorphisms within a species and species divergence have been widely studied (Hughes, Ota, and Nei 1990; Karl and Avise 1992; Orr and Coyne 1992; Berry and Kreitman 1993; Haag and True 2001; Wu 2001). Many researchers have focused on identifying genes and finding genomic regions of functional importance on which selection has acted, thus helping to unravel the evolutionary genetic basis of ecological diversification. For example, diversifying selection (also known as positive selection) can be an indicator of genomic regions that contain genes or gene families of functional importance.
The most reliable indicator of diversifying selection at the molecular level is a higher nonsynonymous nucleotide-substitution rate (dN) than synonymous nucleotide-substitution rate (dS) between two protein-coding DNA sequences (ratio ω = dN/dS > 1) (Li, Wu, and Luo 1985; Nei and Gojobori 1986; Ina 1995; Yang and Bielawski 2000). Based on this criterion, statistical methods, such as the approximate method (also known as the counting method) and the maximum likelihood (ML) method, have been developed and implemented in computer software packages for detecting diversifying selection (Yang and Bielawski 2000). On the basis of such methods, a number of genes involved in defense systems or immunity, genes involved in evading defense systems or immunity, and toxin protein genes have been shown to be under diversifying selection (Stahl and Bishop 2000; Yang and Bielawski 2000).
In plant pathogen interactions, resistance is often regulated by recognition of pathogen molecules by the plant. This is illustrated by the gene-for-gene concept, which implies that an avirulence (Avr) gene from the pathogen is recognized directly or indirectly by a matching resistance (R) gene from the plant, resulting in recognition of the pathogen and activation of plant defense mechanisms (Dangl and Jones 2001). Diversifying selection in genes encoding proteins that function at the interface of attack and defense in host-pathogen antagonism, such as the Avr and R genes, is likely to reflect an "arms race" co-evolution (Thomas and Stephen 1999, 2000; Stahl and Bishop 2000). The rationale is that natural selection driven by a co-evolutionary arms race is likely to leave a signature at the molecular level. Thus, evolutionary analyses of defense or attack (virulence) genes can provide insight into how plants and pathogens co-evolve under the "arms race" model, and the extent to which co-evolutionary interactions shape the present genetic variation in plant and pathogen populations (Stahl and Bishop 2000).
P. infestans research has entered the genomics era. Current genomic resources include expressed sequence tags (ESTs) from a variety of developmental and infection stages, as well as sequences of selected regions of the genome (Kamoun 2003). A number of data-mining and functional strategies have been developed to exploit the sequence resources. For example, Torto et al. (2003) developed an algorithm to identify putative extracellular effector proteins from EST data sets. Bos et al. (2003) described a strategy to identify candidate Avr genes based on the assumption that these genes exhibit significant sequence variation within populations of the pathogen. Accumulation of structural genomic resources, genome sequences for P. infestans, and the availability of appropriate statistical methodologies provide the opportunity to investigate patterns of diversifying selection in effector proteins from P. infestans. Effector proteins are molecules produced by plant pathogens to manipulate biochemical and physiological processes in their host plants by promoting infection (virulence genes) or by triggering defense responses (Avr genes) (Torto et al. 2003). Based on the assumption that evidence of diversifying selection in effector genes could reflect an "arms race" co-evolution between the host and the pathogen, we hypothesized that identifying P. infestans genes under diversifying selection will augment other criteria to help us select candidate effector genes important in virulence and host specificity.
In this study, we mined EST data from infection stages of P. infestans for secreted and potentially polymorphic genes. One class of genes, identified by Torto et al. (2003), encodes secreted small cysteine-rich (SCR) proteins, a feature reminiscent of the products of Avr genes from plant pathogenic fungi and oomycetes (van't Slot and Knogge 2002; Bittner-Eddy et al. 2003). One of these genes, scr74, encodes a predicted protein with significant similarity to PcF, a 52 amino acid phytotoxic necrosis-inducing protein secreted by Phytophthora cactorum (Orsomando et al. 2001). Further characterization of the scr74 gene suggested that it is upregulated during colonization of tomato and potato by P. infestans and forms a highly polymorphic gene family. We investigated the molecular evolution of the scr74 genes by means of the approximate method of Nei and Gojobori (1986), which calculates the average ratio across all the amino acid sites. In addition, we used the ML method to identify particular amino acid residues on which diversifying selection has acted (Nielsen and Yang 1998; Yang and Bielawski 2000). Results showed that diversifying selection likely caused the extensive polymorphism observed within the scr74 gene family. Based on this and additional analyses of gene copy number and organization, we propose an evolutionary model that involves duplication followed by functional divergence of scr74 genes. This study provides support for using diversifying selection as a criterion for identifying candidate effector genes from sequence databases.
Materials and Methods
RNA Manipulations, Northern Blot Analysis, and Real-Time Reverse Transcriptase Polymerase Chain Reaction (RT-PCR)
Total RNA isolation from P. infestans mycelium and from infected tomato leaves, as well as northern blot hybridizations, were carried out as described by Huitema et al. (2003). Total RNA extraction and cDNA synthesis for P. infestans mycelium, sporangia, zoospores, germinating cysts, and uninfected and infected potato cultivar Bintje leaves, and SYBR green real-time RT-PCR assays were carried out as described by Avrova et al. (2003). Real-time RT-PCR primers for the constitutively expressed P. infestans control gene actA, and for the in planta-induced gene calA, are given in Avrova et al. (2003). Primers for scr74 were 5'-CCACGATTGCTGTGGTAAAAGTT-3' and 5'-TCGCTGTGGTTTGGAATCTAGA-3', and amplified a 72-bp fragment.
DNA Manipulations and Southern Blot Analysis
Total DNA samples from the 19 P. infestans isolates listed above were kindly provided by Dr. C. Smart (Cornell University) and were used for Southern blot analysis. DNA samples (15 µg each) were digested with the HindIII restriction enzyme, separated by gel electrophoresis on a 1% agarose gel in TBE buffer, and transferred to Hybond N+ membranes (Amersham Biosciences Corp, Piscataway, N.J.) according to the manufacturer's instructions. Southern blot hybridizations were conducted at 65°C in Modified Church Buffer (0.36 M Na2HPO4, 0.14 M NaH2PO4, 1 mM ethylene diamine tetraacetic acid [EDTA], and 7% sodium dodecyl sulfate [SDS]). Filters were washed at 55°C in 1× SSC/0.5% SDS, and 0.5× SSC/0.5% SDS (Sambrook and Russell 2001). To reveal the hybridizing bands, membranes were exposed to a phosphor imager screen (Molecular Dynamics Storm 840 Phosphor Imager). Hybridizations to the P. infestans BAC library, BAC DNA isolation, BAC end-sequencing, and Southern blotting of BAC clones were performed as described by Whisson et al. (2001).
Hybridization Probes for Southern and Northern Blot Analysis
DNA inserts from cDNA clones of scr74 and actA were gel-purified after digestion and used as probes for Southern and northern blot hybridizations. The probes were radiolabeled with 32P-dATP using a random primer labeling kit (Invitrogen, San Diego, Calif.).
Primer Design, PCR Amplification, and Sequencing
A pair of oligonucleotide primers, SCR74-FCla (5'-GGAAATCGATCCGGTCATCGTCACTACTCAACAGCTCG-3') and SCR74-RNot (5'-GGAAGCGGCCGCTTCATTCATTTGATTATCACTGTATCTC-3'), was designed for the amplification of a 304-bp fragment containing the entire open reading frame (ORF) of scr74. The fragments were cloned in pGR106 (Lu et al. 2004) using the ClaI and NotI restriction enzymes. Five P. infestans isolates were used for PCR amplifications (table 1). Polymerase chain reaction amplifications and DNA sequencing were performed as described earlier (Bos et al. 2003). The sequences described here were deposited in GenBank (accession numbers AY723699 to AY723725).
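As a quick sanity check on the primer design, the following minimal sketch locates the ClaI and NotI recognition sequences inside the two primers quoted above. The recognition sequences (ATCGAT for ClaI, GCGGCCGC for NotI) are standard; the helper name `find_sites` is hypothetical and is not part of the original study.

```python
# Recognition sequences of the two cloning enzymes named in the text.
SITES = {"ClaI": "ATCGAT", "NotI": "GCGGCCGC"}

# Primer sequences as quoted in the text (5' to 3').
PRIMERS = {
    "SCR74-FCla": "GGAAATCGATCCGGTCATCGTCACTACTCAACAGCTCG",
    "SCR74-RNot": "GGAAGCGGCCGCTTCATTCATTTGATTATCACTGTATCTC",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based start index} for every site present in the primer."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Each primer carries exactly one of the two sites near its 5' end, which is consistent with directional cloning of the amplicon into the vector.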
Databases
We examined several sequence databases, including the publicly available GenBank nonredundant databases and dbEST (Karsch-Mizrachi and Ouellette 2001), the Phytophthora Functional Genomics Database (PFGD, www.pfgd.org), and the Syngenta Phytophthora Consortium (SPC) database, a proprietary database of Syngenta Inc. containing ca. 75,000 ESTs from P. infestans.
Diversifying Selection Analyses
The rate of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) and the rate of synonymous nucleotide substitutions per synonymous site (dS) across all the amino acid sites in pairwise comparisons between nucleotide sequences were estimated using the approximate method of Nei and Gojobori (1986) implemented in the YN00 program in the PAML software package (Yang 1997).
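To make the quantities concrete, here is a minimal sketch of the final step of a counting-type estimate: the proportions of synonymous and nonsynonymous differences are corrected for multiple hits with the Jukes-Cantor formula, and their ratio gives ω. It is not the YN00 implementation; the counting of sites and differences is assumed to have been done already, and the input numbers below are hypothetical illustrations, not data from this study.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor correction: substitutions per site from a proportion of differences."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(syn_diffs, nonsyn_diffs, syn_sites, nonsyn_sites):
    """Return (dN, dS, omega) from pre-counted differences and sites."""
    d_s = jukes_cantor(syn_diffs / syn_sites)
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    omega = d_n / d_s if d_s > 0 else math.nan  # ratio is undefined when dS is 0
    return d_n, d_s, omega


# Hypothetical counts for one pairwise comparison of a 222-nt ORF.
d_n, d_s, omega = dn_ds(syn_diffs=2, nonsyn_diffs=12, syn_sites=50, nonsyn_sites=172)
print(f"dN={d_n:.4f} dS={d_s:.4f} omega={omega:.2f}")
```

An ω above 1 in such a comparison is the pattern interpreted as diversifying selection; because the estimate is averaged over all sites, it is conservative when only a few sites are under selection.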
To identify which SCR74 amino acids have been affected by diversifying selection, we used maximum likelihood models of codon substitution that allow for heterogeneous selection pressures among sites along the protein (Nielsen and Yang 1998; Yang et al. 2000; Yang and Bielawski 2000). Analyses were done with the computer program CODEML in the PAML package (Yang 1997). This method consists of two major steps. The first step uses the likelihood ratio test (LRT) to test for diversifying selection by comparing a null model with an alternative one that accounts for sites under diversifying selection. The six models recommended by Yang et al. (2000) were tested: the null models M0, M1, and M7 were compared with the alternative models M3, M2, and M8, respectively. Twice the difference in log likelihood between a null model and an alternative model was compared with a chi-squared (χ²) distribution with degrees of freedom equaling the difference in the numbers of parameters estimated from the pair of models. The LRT thus tests whether an alternative model fits the data better than the null model. The second step identifies amino acids under diversifying selection by using the empirical Bayes theorem, as implemented in CODEML, to calculate the posterior probability that a particular amino acid belongs to a given selection class (neutral, deleterious or advantageous) (Yang 1997). Amino acid sites with a high posterior probability for an advantageous class of sites (ω > 1) were deemed more likely to be under diversifying selection.
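The second step can be illustrated with a minimal sketch of the Bayes calculation for a single site: each site class has a prior proportion (estimated by ML, hence "empirical"), and the posterior is the prior times the site likelihood under that class, normalized over classes. This is not CODEML code; the function name and all numbers below are hypothetical and chosen only for illustration.

```python
def site_class_posteriors(proportions, site_likelihoods):
    """Posterior probability of each site class for one site.

    proportions      -- estimated prior proportion of each class (sums to 1)
    site_likelihoods -- likelihood of the site's data under each class
    """
    joint = [p * lik for p, lik in zip(proportions, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical three-class example: the third class stands for the
# advantageous class of sites.
proportions = [0.61, 0.23, 0.16]
site_likelihoods = [1e-6, 4e-6, 2e-5]
posteriors = site_class_posteriors(proportions, site_likelihoods)
print([round(p, 3) for p in posteriors])
```

A site would be reported as likely under diversifying selection when the posterior for the advantageous class exceeds a chosen cutoff such as 0.95 or 0.99.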
Analysis of Genetic Recombination
Open reading frames of scr74 nucleotide sequence were translated into amino acid sequences, and multiple alignments were conducted using the CLUSTAL-X program (Thompson et al. 1997). Evidence for genetic recombination was sought with the SplitsTree program that uses the Split Decomposition method (Huson 1998). The difference in sum of squares (DSS) statistics (McGuire and Wright 2000) implemented in the TOPALi program (Milne et al. 2004) was used to further investigate recombination. Phylogenetic trees were constructed with the Neighbor-Joining method based on Jukes-Cantor distances as implemented in TOPALi (Milne et al. 2004). Parametric bootstrapping using the DSS statistic was used to compare tree topologies (Goldman, Anderson, and Rodrigo 2000).
Results
The scr74 Gene Is Upregulated During Infection of Tomato and Potato by P. infestans
To determine whether the scr74 gene is upregulated during infection, we used northern blot analysis to detect scr74 mRNA in a time course of P. infestans infection of its host plant tomato. Total RNA was isolated from leaves of tomato 0, 1, 2, 3, and 4 days post-inoculation (dpi) with P. infestans isolate 90128 and from P. infestans mycelium grown in liquid rye-sucrose medium. A northern blot containing these samples was hybridized with probes made from P. infestans scr74 and the constitutive gene actA (Unkles et al. 1991). During the interaction, scr74 transcripts were first detected at 2 dpi and reached maximal levels at 3 and 4 dpi (fig. 1A). In contrast to actA, scr74 resulted in significantly stronger hybridization signals in the time course compared to mycelium, suggesting that the scr74 gene is upregulated during infection of tomato by P. infestans (fig. 1A). Quantification of the hybridization signals indicated that scr74 is upregulated approximately 60-fold at 2 to 4 dpi compared to mycelium.
scr74 Belongs To A Polymorphic Gene Family in P. infestans
We used the scr74 sequence to search the SPC database, containing ca. 75,000 ESTs from P. infestans. We identified one scr74-like EST, PH011A12, generated from germinating cysts, a pre-infection stage of P. infestans. Similar to PC015G09, the sequence of the cDNA corresponding to PH011A12 revealed a 222-bp ORF corresponding to a predicted translated product of 74 amino acids. However, the two predicted proteins differed in eight amino acids. This result prompted us to use Southern hybridization for a preliminary investigation of the copy number of scr74-like sequences among 19 different P. infestans isolates representing 10 US clonal genotypes (Goodwin et al. 1998) (fig. 2). The scr74 probe hybridized to at least 10 different genomic DNA fragments in 12 of 19 isolates (fig. 2). The number of hybridizing bands varied between isolates, ranging from two to five, indicating a polymorphic gene family.
Multiple alignments of the 17 predicted SCR74 amino acid sequences revealed that eight conserved cysteines define the scr74 gene family signature (fig. 3, the nucleotide multiple alignment is available online in the Supplementary Material). No differences in length (74 amino acids) were observed. Each member of the SCR74 family was predicted to contain a signal peptide (positions 1 to 21), and an extracellular mature protein of 53 amino acids (positions 22 to 74, the last residue of SCR74; fig. 3). Amino acid sequences of the SCR74 family were highly polymorphic. A total of 21 polymorphic amino acid sites were identified. Interestingly, 19 of 21 polymorphic sites were located in the mature protein, whereas only 2 of 21 were in the signal peptide, suggesting that amino acid variation was more frequent in the mature protein than in the signal peptide. Amino acid substitutions involved different chemical classes of amino acids, such as hydrophobic amino acids, including phenylalanine (F), leucine (L), and alanine (A), and charged amino acids, including arginine (R), lysine (K), and histidine (H).
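Because the mature protein is also longer than the signal peptide, the raw counts are easier to compare once they are normalized by region length. The short calculation below does this using only the figures stated above (21 signal-peptide residues with 2 polymorphic sites, 53 mature-protein residues with 19 polymorphic sites); the variable names are illustrative.

```python
# Residue counts taken from the text: signal peptide = positions 1-21,
# mature protein = positions 22-74, with 2 and 19 polymorphic sites.
regions = {
    "signal peptide": {"length": 21, "polymorphic": 2},
    "mature protein": {"length": 53, "polymorphic": 19},
}

# Fraction of residues in each region that are polymorphic.
rates = {name: r["polymorphic"] / r["length"] for name, r in regions.items()}
fold = rates["mature protein"] / rates["signal peptide"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} of residues polymorphic")
print(f"fold difference: {fold:.2f}")
```

Per residue, polymorphic sites are therefore several times more frequent in the mature protein, so the difference is not simply a consequence of region length.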
SCR74 Amino Acid Sites Under Diversifying Selection
To detect the particular amino acid sites under diversifying selection in the scr74 gene family, we applied three pairs of ML models of codon substitution: M3/M0, M8/M7, and M2/M1 (Nielsen and Yang 1998; Yang et al. 2000). The discrete model M3 with three site classes suggested that about 23% of the amino acid sites were under diversifying selection with ω1 = 5.542, whereas about 16% of amino acid sites were under strong diversifying selection with ω2 = 14.711 (table 2). The LRT statistic for comparing M3 with M0 is 2Δℓ = 2 × [−619.35 − (−638.61)] = 38.52, which is greater than the χ² critical value (13.28 at the 1% significance level, with degrees of freedom = 4; table 2). This indicates that the discrete model M3 fits the data significantly better than the neutral model M0, which does not allow for the presence of diversifying selection sites with ω > 1. We then used the empirical Bayes theorem to identify 21 amino acid sites implicated as being under diversifying selection with greater than 99% confidence under the discrete model M3 (table 2). We plotted the position of the 21 diversifying selection sites in SCR74 (fig. 5). Interestingly, about 90% (19 of 21) of amino acid sites were located in the mature SCR74 protein, whereas only about 10% (two of 21) of amino acid sites were in the signal peptide. Again, this suggests that sites under diversifying selection occur more frequently in the mature protein of SCR74.
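The M3 versus M0 comparison can be reproduced with a few lines of arithmetic. The sketch below uses the two log-likelihood values and the four degrees of freedom reported above; the assignment of −638.61 to M0 and −619.35 to M3 is an inference from the fact that M3 nests M0 and so cannot fit worse. The tail probability uses the closed form of the chi-squared distribution for even degrees of freedom, which avoids any external library.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with even df (closed form)."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total


lnL_M0 = -638.61   # null model (assumed assignment, see lead-in)
lnL_M3 = -619.35   # alternative model
df = 4

stat = 2 * (lnL_M3 - lnL_M0)
p_value = chi2_sf_even_df(stat, df)
print(f"2*delta lnL = {stat:.2f}, p = {p_value:.2e}")
```

The same helper confirms that 13.28 is the 1% critical value for four degrees of freedom, since `chi2_sf_even_df(13.28, 4)` is approximately 0.01.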
The selection model M2 did not identify any sites under diversifying selection. The probable reason is that the neutral model M1 failed to account for sites with 0 < ω < 1 that occur in the SCR74 data set (Yang et al. 2000). Thus, the small proportion of sites with ω > 1 in the SCR74 data set were incorrectly added to the class of neutral sites with ω = 1 using this model (Yang et al. 2000).
scr74 Genes Are Clustered in the P. infestans Genome
To further investigate gene copy number and organization, scr74 was hybridized to a P. infestans BAC library constructed from strain T30-4 (Whisson et al. 2001). Twelve hybridizing BAC clones were identified. Given that the BAC library was estimated to represent 10-fold genome coverage, 12 hybridizing clones suggest that scr74 is either a single-copy gene or is present as a tightly clustered gene family in this strain. Southern hybridization of scr74 to the 12 clones, restriction digested with HindIII, is presented in figure 6. Three clones, 11G3, 12O12, and 42H10, contained two hybridizing restriction fragments, suggesting the presence of two copies of scr74. Clone 42H10 was partially sequenced at the MIT Broad Institute (GenBank accession number AC147005). A 31,592 bp contig from this sequence was found to contain two copies of scr74 sequence separated by 24,608 bp between ATG start codons. Two additional groups of BAC clones showed single hybridizing restriction fragments in two discrete size classes. These may represent a third and fourth copy of the scr74 gene, or alleles of a third copy. Clone 19M21, containing the larger size class of hybridizing restriction fragment, has also been partially sequenced (83 kb of an estimated total size of 130 kb) at the Broad Institute (accession number AC147508). No sequences similar to scr74 were found in the 19M21 partial sequence. A greater sequence coverage of the clone, or targeted sequencing of regions between existing contigs, may be required to locate this scr74-like gene copy. Clone 64I10 contains the remaining size class of hybridizing restriction fragment. This BAC was sub-cloned and sequenced at SCRI, and a third sequence variant of scr74 was obtained. Southern analysis, BAC end-sequencing, and PCR were used as described previously (Whisson et al. 2001) to show that all 12 BACs are contiguous. These analyses also indicated that at least three copies of the scr74 gene are clustered in a 300-kb region of the P. infestans T30-4 genome (results not shown).
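The copy-number reasoning from library coverage can be sketched numerically. Under the simplifying assumption (ours, not stated in the study) that the number of clones covering a given locus is Poisson distributed with mean equal to the library coverage, a single locus in a 10-fold library is expected to yield about 10 clones, and two unlinked loci about 20.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson variable with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


coverage = 10   # estimated genome coverage of the BAC library (from the text)
observed = 12   # hybridizing clones (from the text)

# Probability of seeing at least 12 clones if scr74 sits at one locus.
p_one_locus = 1 - poisson_cdf(observed - 1, coverage)
# Probability of seeing 12 or fewer clones if there were two unlinked loci.
p_two_loci = poisson_cdf(observed, 2 * coverage)

print(f"P(>=12 | one locus)  = {p_one_locus:.3f}")
print(f"P(<=12 | two loci)   = {p_two_loci:.3f}")
```

Under this model, 12 clones is unremarkable for one locus but unlikely for two unlinked loci, which is the logic behind interpreting the result as a single-copy gene or a tightly clustered family.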
In addition, we used a window approach to detect likely recombination. We used the DSS method (McGuire and Wright 2000) in the TOPALi program (Milne et al. 2004) with a range of half-window sizes. The only significant result followed splitting the SCR74 alignment into two halves (i.e., half-window of 122 bp) which was significant at P < 0.05 based on a parametric bootstrapping test to compare two tree topologies (Goldman, Anderson, and Rodrigo 2000), but using a DSS statistic rather than Log Likelihood.
We compared phylogenetic trees constructed using the first and second halves of the alignment to see if we could infer recombinants (fig. 7B and 7C). The automatic detection algorithm in TOPALi failed to recover recombinants, presumably because of low signal in the data. However, visual inspection confirmed that the six sequences [(B3, B7); (E6, C4, E5); and C3] lacking evidence of recombination in the SplitsTree diagram appeared to maintain their relative positions, while a considerable number of topology shifts have occurred with respect to the remaining sequences. This pattern is consistent with recombination having been active in at least one of the scr74 loci.
Discussion
Unfortunately, using the method described by Bos et al. (2003), we could not perform association genetic analyses to assess the likelihood that scr74 is an Avr gene. All isolates examined produced mixed amplicons, and DNA hybridization experiments indicated the probable presence of scr74 paralogs in all isolates. The occurrence of multiple scr74-like sequences in the P. infestans genome confounds association studies because it is not possible to determine unambiguously the array of scr74 sequences of a given isolate by PCR amplification. Detailed characterization of scr74 genomic loci will help to design improved strategies for isolate genotyping.
Evolutionary analyses revealed that the P. infestans scr74 gene family exhibits an unusual pattern of evolution and is likely to be under diversifying selection as detected by both the approximate method of Nei and Gojobori (1986) and the ML method (Nielsen and Yang 1998; Yang et al. 2000; Yang and Bielawski 2000). In most proteins, neutral and purifying selection are thought to be major evolutionary forces, with a high proportion of amino acid sites conserved as a result of structural and functional constraints (Li 1997; Golding and Dean 1998). Under these circumstances, the approximate method should not be sensitive enough to detect diversifying selection because it averages ratios over all sites of the protein (Yang and Bielawski 2000). Nevertheless, using the approximate method, we detected diversifying selection across the entire scr74 sequence. This is likely because of the large number of highly divergent scr74 sequences (Yang and Bielawski 2000) and the relatively small and simple structure of these genes.
Compared to the approximate method, the ML method developed by Yang and collaborators (Yang et al. 2000; Yang and Bielawski 2000) is more sensitive for detecting diversifying selection and can also identify the particular amino acid sites under diversifying selection. We also obtained significant support for diversifying selection in the SCR74 family using two of three models implemented in the ML method. Interestingly, all 21 polymorphic amino acid sites were identified as being under diversifying selection, although at different confidence levels.
We also found higher dN to dS ratios in the mature protein region than in the signal peptide region of SCR74. Moreover, 19 of 21 amino acid sites under diversifying selection (90%) are located in the mature protein region. Remarkably, most polymorphic nucleotide sites, 27 out of a total of 32 polymorphisms or a 16.7% polymorphism rate, were in the mature protein region of the ORF. The signal peptide region and the sequenced portion of the UTRs accounted for only four and one polymorphic sites, respectively. The nucleotide polymorphism rates in the signal peptide region and the UTRs were also lower than in the mature protein region at 6.3% and 1.3%, respectively. This rapid accumulation of nucleotide changes and amino acid replacements in the mature protein region indicates that diversifying selection has been acting mainly on this portion of the protein. Thus, diversifying selection may have acted on the domain of SCR74 that is directly related to its biological activity, and it may have shaped functional diversity in this protein family. Similar observations have been made for other genes under diversifying selection, such as murine β-defensins (Morrison et al. 2003) and toxin genes in the venomous gastropod Conus (Thomas and Stephen 1999, 2000).
How did the scr74 genes evolve to result in divergent, rapidly accumulated nucleotide changes in sequences corresponding to the mature protein but remain relatively conserved in the signal peptide and UTR sequences? Gene duplication, followed by functional divergence of duplicated genes, is an important evolutionary force for the emergence of new gene function (Stephens 1951; Nei 1969; Ohno 1970; Ohta 1980, 1993). Goodman (1976) and Goodman et al. (1987) first reported that the rate of amino acid substitutions was accelerated following duplication of hemoglobin genes into α and β hemoglobins and suggested that this acceleration was caused by natural selection. In fact, it has been debatable whether rapid evolution (acceleration in the rate of amino acid substitutions) in gene families following gene duplication occurs by diversifying selection or relaxation of functional constraints due to gene redundancy (Kimura 1983; Li 1985; Ohta 1993, 1994). Based on our analyses, we propose an evolutionary model that involves gene duplication followed by functional divergence of scr74 genes. We found at least three copies of scr74 to be clustered in a region of the P. infestans T30-4 genome of less than 300 kb, suggesting that gene duplication may have occurred. Moreover, we also detected genetic recombination in at least one of the scr74 gene loci. Both diversifying selection and relaxation of selective constraints may have played complementary roles in promoting sequence and functional divergence following gene duplication and recombination. This explanation is consistent with those for the adaptive evolution of primate murine β-defensin gene family in vertebrates (Morrison et al. 2003) and pancreatic ribonuclease genes in a leaf-eating monkey (Zhang, Zhang, and Rosenberg 2002). We are currently further analyzing P. infestans genomic clones containing scr74 genes to gain more insights into the role of gene duplication and recombination in the molecular evolution of this family.
What is driving the diversifying selection observed in scr74 genes? The co-evolutionary "arms race" model states that adaptation and counter-adaptation between host and pathogen or parasite drive their antagonistic co-evolution (Dawkins and Krebs 1979). Co-evolution of host-pathogen or host-parasite is thought to generate the evolutionary forces that shape the genes involved in these interactions. Recently, a number of genes involved in host-pathogen antagonistic co-evolution have been revealed to be under diversifying selection, resulting in accelerated amino acid substitutions in sites that determine recognition by the host or the pathogen (Stahl and Bishop 2000). Several toxins and their counteracting detoxifying enzymes, as well as hydrolytic enzymes and their corresponding inhibitors, are thought to be involved in antagonistic co-evolution, and diversifying selection acting on these molecules has been documented (Leckie et al. 1999; Thomas and Stephen 1999, 2000; Bishop, Dean, and Mitchell-Olds 2000; Stotz et al. 2000). For example, several studies showed that diversifying selection has acted on hypervariable solvent-exposed residues of the leucine-rich repeat (LRR) region of some plant disease resistance R proteins from Arabidopsis, lettuce, tomato, rice, and flax (Parniske et al. 1997; Meyers et al. 1998; Wang et al. 1998; Noel et al. 1999; Ellis, Dodds, and Pryor 2000; Mondragon-Palomino et al. 2002). Diversifying selection in R genes is thought to reflect an "arms race" in plant-pathogen co-evolution in order to select novel resistance specificities (Endo, Ikeo, and Gojobori 1996; Stahl and Bishop 2000; Yang 2002). Interestingly, diversifying selection was recently detected in the flax rust Avr genes AvrL567 that are recognized by the flax L5, L6, or L7 R genes, lending additional support for an arms race model of gene-for-gene evolution (Dodds et al. 2004).
Here we show that the scr74 gene family of the oomycete plant pathogen P. infestans is under diversifying selection. Although the nature of the selective pressures remains unclear, we propose that diversifying selection acting on the mature SCR74 protein has resulted in functionally important intraspecific polymorphisms. SCR74 is a secreted protein with similarity to a necrosis-inducing protein and has many hallmarks of an effector protein that may play a role in the infection process. The scr74 genes are also significantly upregulated during infection of host plants. Altogether, this suggests that the selective forces that shaped scr74 evolution might be related to host-pathogen co-evolution. Future functional analyses, such as the in planta expression assays described by Huitema et al. (2004), combined with site-directed mutagenesis, will be necessary to determine the nature and significance of the adaptive changes, as well as to dissect the functional basis of adaptive evolution in scr74.
We did not detect the patterns of polymorphisms and diversifying selection observed in the SCR74 family in elicitins, a well-studied family of secreted cysteine-rich proteins of Phytophthora that has been implicated in host specificity (Kamoun, Lindqvist, and Govers 1997; Kamoun et al. 1997; Qutob et al. 2003). In repeated analyses using the approximate and ML methods, we found no evidence of positive selection in elicitin sequences from P. infestans and Phytophthora sojae (unpublished data). This suggests that distinct selective forces shaped the SCR74 and elicitin families throughout the evolution of Phytophthora. Slow rates of evolution in elicitins are consistent with the view that these proteins are recognized by ancient broad-spectrum plant genes and are implicated in species-level or non-host resistance (Kamoun 2001).
This study provides support for using diversifying selection as an additional selection criterion for candidate effector genes from EST databases, and it therefore augments previously defined criteria, such as secretion and intraspecific polymorphism (Bos et al. 2003; Torto et al. 2003). In the future, accumulation of cDNA and genomic sequences from plant and animal pathogens will yield more opportunities to investigate patterns of diversifying selection in effector gene families. Ultimately, analyses of diversifying selection will help us to establish functional connections between pathogen effectors and host defense processes, and to provide insights into the molecular basis of pathogen-host co-evolution.
References
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.
Apweiler, R., T. K. Attwood, A. Bairoch et al. (25 co-authors). 2001. The InterPro database, an integrated documentation resource for protein families, domains, and functional sites. Nucleic Acids Res. 29:37–40.
Avrova, A. O., E. Venter, P. R. J. Birch, and S. C. Whisson. 2003. Profiling and quantifying differential gene transcription in Phytophthora infestans prior to and during the early stages of potato infection. Fungal Genet. Biol. 40:4–14.
Baldauf, S. L., A. J. Roger, I. Wenk-Siefert, and W. E. Doolittle. 2000. A kingdom-level phylogeny of eukaryotes based on combined protein data. Science 290:972–977.
Bateman, A., E. Birney, L. Cerruti, R. Durbin, L. Etwiller, S. R. Eddy, S. Griffiths-Jones, K. L. Howe, M. Marshall, and E. L. Sonnhammer. 2002. The Pfam protein families database. Nucleic Acids Res. 30:276–280.
Berry, A., and M. Kreitman. 1993. Molecular analysis of an allozyme cline: alcohol dehydrogenase in Drosophila melanogaster on the east coast of North America. Genetics 134:869–893.
Birch, P. R. J., and S. Whisson. 2001. Phytophthora infestans enters the genomics era. Mol. Plant Pathol. 2:257–263.
Bishop, J. G., A. M. Dean, and T. Mitchell-Olds. 2000. Rapid evolution in plant chitinases: molecular targets of selection in plant-pathogen coevolution. Proc. Natl. Acad. Sci. USA 97:5322–5327.
Bittner-Eddy, P. D., R. L. Allen, A. P. Rehmany, P. Birch, and J. L. Beynon. 2003. Use of suppression subtractive hybridization to identify downy mildew genes expressed during infection of Arabidopsis thaliana. Mol. Plant Pathol. 4:501–507.
Bos, J. I. B., M. Armstrong, S. C. Whisson, T. A. Torto, M. Ochwo, P. R. J. Birch, and S. Kamoun. 2003. Intraspecific comparative genomics to identify avirulence genes from Phytophthora. New Phytologist 159:63–72.
Caten, C. E., and J. L. Jinks. 1968. Spontaneous variability of single isolates of Phytophthora infestans. I. Cultural variation. Can. J. Bot. 46:329–347.
Dangl, J. L., and J. D. Jones. 2001. Plant pathogens and integrated defence responses to infection. Nature 411:826–833.
Dawkins, R., and J. R. Krebs. 1979. Arms races between and within species. Proc. R. Soc. Lond. B Biol. Sci. 205:489–511.
Dodds, P. N., G. J. Lawrence, A. M. Catanzariti, M. A. Ayliffe, and J. G. Ellis. 2004. The Melampsora lini AvrL567 avirulence genes are expressed in haustoria and their products are recognized inside plant cells. Plant Cell 16:755–768.
Ellis, J., P. Dodds, and T. Pryor. 2000. The generation of plant disease resistance gene specificities. Trends Plant Sci. 5:373–379.
Endo, T., K. Ikeo, and T. Gojobori. 1996. Large-scale search for genes on which positive selection may operate. Mol. Biol. Evol. 13:685–690.
Ewing, B., and P. Green. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186–194.
Ewing, B., L. Hillier, M. C. Wendl, and P. Green. 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8:175–185.
Fry, W. E., and S. B. Goodwin. 1997a. Re-emergence of potato and tomato late blight in the United States. Plant Dis. 81:1349–1357.
Fry, W. E., and S. B. Goodwin. 1997b. Resurgence of the Irish potato famine fungus. Bioscience 47:363–371.
Golding, G. B., and A. M. Dean. 1998. The structural basis of molecular adaptation. Mol. Biol. Evol. 15:355–369.
Goldman, N., J. P. Anderson, and A. G. Rodrigo. 2000. Likelihood-based tests of topologies in phylogenetics. Syst. Biol. 49:652–670.
Goodman, M. 1976. Protein sequences in phylogeny. Pp. 141–159 in F. J. Ayala, ed. Molecular evolution. Sinauer Associates, Sunderland, Mass.
Goodman, M., J. Czelusniak, B. F. Koop, D. A. Tagle, and J. L. Slightom. 1987. Globins: a case study in molecular phylogeny. Cold Spring Harbor Symp. Quant. Biol. 52:875–890.
Goodwin, S. B., C. D. Smart, R. W. Sandrock, K. L. Deahl, Z. K. Punja, and W. E. Fry. 1998. Genetic change within populations of Phytophthora infestans in the United States and Canada during 1994 to 1996: role of migration and recombination. Phytopathology 88:939–949.
Haag, E. S., and J. R. True. 2001. Perspective: from mutants to mechanisms? Assessing the candidate gene paradigm in evolutionary biology. Evolution 55:1077–1084.
Henikoff, J. G., E. A. Greene, S. Pietrokovski, and S. Henikoff. 2000. Increased coverage of protein families with the blocks database servers. Nucleic Acids Res. 28:228–230.
Hudson, R. R. 1993. Levels of DNA polymorphism and divergence yield important insights into evolutionary processes. Comment. Proc. Natl. Acad. Sci. USA 90:7425–7426.
Hughes, A. L., T. Ota, and M. Nei. 1990. Positive Darwinian selection promotes charge profile diversity in the antigen-binding cleft of class I major histocompatibility complex molecules. Mol. Biol. Evol. 7:515–524.
Huitema, E., V. G. A. A. Vleeshouwers, D. M. Francis, and S. Kamoun. 2003. Active defence responses associated with non-host resistance of Arabidopsis thaliana to the oomycete pathogen Phytophthora infestans. Mol. Plant Pathol. 4:487–500.
Huitema, E., J. I. Bos, M. Tian, J. Win, M. E. Waugh, and S. Kamoun. 2004. Linking sequence to phenotype in Phytophthora-plant interactions. Trends Microbiol. 12:193–200.
Huson, D. H. 1998. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics 14:68–73.
Ina, Y. 1995. New methods for estimating the numbers of synonymous and nonsynonymous substitutions. J. Mol. Evol. 40:190–226.
Kamoun, S., M. Young, C. Glascock, and B. M. Tyler. 1993. Extracellular protein elicitors from Phytophthora: host-specificity and induction of resistance to fungal and bacterial phytopathogens. Mol. Plant-Microbe Interact. 6:15–25.
Kamoun, S., H. Lindqvist, and F. Govers. 1997. A novel class of elicitin-like genes from Phytophthora infestans. Mol. Plant-Microbe Interact. 10:1028–1030.
Kamoun, S., P. van West, A. J. de Jong, K. de Groot, V. Vleeshouwers, and F. Govers. 1997. A gene encoding a protein elicitor of Phytophthora infestans is down-regulated during infection of potato. Mol. Plant-Microbe Interact. 10:13–20.
Kamoun, S. 2001. Nonhost resistance to Phytophthora: novel prospects for a classical problem. Curr. Opin. Plant Biol. 4:295–300.
Kamoun, S. 2003. Molecular genetics of pathogenic oomycetes. Eukaryotic Cell 2:191–199.
Karl, S. A., and J. C. Avise. 1992. Balancing selection at allozyme loci in oysters: implications from nuclear RFLPs. Science 256:100–102.
Karsch-Mizrachi, I., and B. F. Ouellette. 2001. The GenBank sequence database. Methods Biochem. Anal. 43:45–63.
Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge, U.K.
Leckie, F., B. Mattei, C. Capodicasa, A. Hemmings, L. Nuss, B. Aracri, G. De Lorenzo, and F. Cervone. 1999. The specificity of polygalacturonase-inhibiting protein (PGIP): a single amino acid substitution in the solvent-exposed β-strand/β-turn region of the leucine-rich repeats (LRRs) confers a new recognition capability. EMBO J. 18:2352–2363.
Letunic, I., L. Goodstadt, N. J. Dickens, T. Doerks, J. Schultz, R. Mott, F. Ciccarelli, R. R. Copley, C. P. Ponting, and P. Bork. 2002. Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res. 30:242–244.
Li, W. H., C. I. Wu, and C. C. Luo. 1985. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150–174.
Li, W. H. 1985. Accelerated evolution following gene duplication and its implications for the neutralist-selectionist controversy. Pp. 333–352 in Ohta, T., and Aoki, K., eds. Population genetics and molecular evolution. Japan Scientific Societies Press, Tokyo, Japan.
Li, W. H. 1997. Molecular evolution. Sinauer Associates, Sunderland, Mass.
Lu, R., I. Malcuit, P. Moffett, M. T. Ruiz, J. Peart, A. J. Wu, J. P. Rathjen, A. Bendahmane, L. Day, and D. C. Baulcombe. 2004. High throughput virus-induced gene silencing implicates heat shock protein 90 in plant disease resistance. EMBO J. 3:5690–5699.
Margulis, L., and K. V. Schwartz. 2000. Five kingdoms: an illustrated guide to the phyla of life on earth. W. H. Freeman and Co., New York, N. Y.
McGuire, G., and F. Wright. 2000. TOPAL 2.0: improved detection of mosaic sequences within multiple alignments. Bioinformatics 16:130–134.
Meyers, B. C., K. A. Shen, P. Rohani, B. S. Gaut, and R. W. Michelmore. 1998. Receptor-like genes in the major resistance locus of lettuce are subject to divergent selection. Plant Cell 10:1833–1846.
Milne, I., F. Wright, G. Rowe, D. F. Marshal, D. Husmeier, and G. McGuire. 2004. TOPALi: Software for automatic identification of recombinant sequences within DNA multiple alignments. Bioinformatics 20:1806–1807.
Mondragon-Palomino, M., B. C. Meyers, R. W. Michelmore, and B. S. Gaut. 2002. Patterns of positive selection in the complete NBS-LRR gene family of Arabidopsis thaliana. Genome Res. 12:1305–1315.
Morrison, G. M., C. A. M. Semple, F. M. Kilanowski, R. E. Hill, and J. R. Dorin. 2003. Signal sequence conservation and mature peptide divergence within subgroups of the murine β-Defensin gene family. Mol. Biol. Evol. 20:460–470.
Nei, M. 1969. Gene duplication and nucleotide substitution in evolution. Nature 221:40–42.
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the number of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426.
Nicholls, H. 2004. Stopping the rot. PLoS Biol. 2:891–895.
Nielsen, H., J. Engelbrecht, S. Brunak, and G. von Heijne. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10:1–6.
Nielsen, H., and A. Krogh. 1998. Prediction of signal peptides and signal anchors by a hidden Markov model. Pp. 122–130 in Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology (ISMB 6). AAAI Press, Menlo Park, Calif.
Nielsen, R., and Z. Yang. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936.
Noel, L., T. L. Moores, E. A. Van Der Biezen, M. Parniske, M. J. Daniels, J. E. Parker, and J. D. Jones. 1999. Pronounced intraspecific haplotype divergence at the RPP5 complex disease resistance locus of Arabidopsis. Plant Cell 11:2099–2112.
Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, New York.
Ohta, T. 1980. Evolution and variation of multigene families (Lecture notes in biomathematics, vol. 37). Springer-Verlag, New York.
Ohta, T. 1993. Pattern of nucleotide substitutions in growth hormone-prolactin gene family: a paradigm for evolution by gene duplication. Genetics 134:1271–1276.
Ohta, T. 1994. Further examples of evolution by gene duplication revealed through DNA sequence comparisons. Genetics 138:1331–1337.
Orr, H. A., and J. A. Coyne. 1992. The genetics of adaptation: a reassessment. Am. Nat. 140:725–742.
Orsomando, G., M. Lorenzi, N. Raffaelli, M. D. Rizza, B. Mezzetti, and S. Ruggieri. 2001. Phytotoxic protein PcF, purification, characterization, and cDNA sequencing of a novel hydroxyproline-containing factor secreted by the strawberry pathogen Phytophthora cactorum. J. Biol. Chem. 276:21578–21584.
Parniske, M., K. E. Hammond-Kosack, C. Golstein, C. M. Thomas, D. A. Jones, K. Harrison, B. B. Wulff, and J. D. Jones. 1997. Novel disease resistance specificities result from sequence exchange between tandemly repeated genes at the Cf-4/9 locus of tomato. Cell 91:821–832.
Qutob, D., E. Huitema, M. Gijzen, and S. Kamoun. 2003. Variation in structure and activity among elicitins from Phytophthora sojae. Mol. Plant Pathol. 4:119–124.
Ristaino, J. B. 2002. Tracking historic migrations of the Irish potato famine pathogen, Phytophthora infestans. Microbes Infect. 4:1369–1377.
Sambrook, J., and D. W. Russell. 2001. Molecular cloning. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N. Y.
Schiermeier, Q. 2001. Russia needs help to fend off potato famine, researchers warn. Nature 440:1011.
Shattock, R. C. 2002. Phytophthora infestans: populations, pathogenicity and phenylamides. Pest Manage. Sci. 58:944–950.
Smart, C. D., and W. E. Fry. 2001. Invasions by the late blight pathogen: renewed sex and enhanced fitness. Biol. Invas. 3:235–243.
Sogin, M. L., and J. D. Silberman. 1998. Evolution of the protists and protistan parasites from the perspective of molecular systematics. Int. J. Parasitol. 28:11–20.
Stahl, E. A., and J. G. Bishop. 2000. Plant-pathogen arms races at the molecular level. Curr. Opin. Plant Biol. 3:299–304.
Stephens, S. G. 1951. Possible significance of duplication in evolution. Adv. Genet. 4:247–265.
Stotz, H. U., J. G. Bishop, C. W. Bergmann, M. Koch, P. Albersheim, A. G. Darvill, and J. M. Labavitch. 2000. Identification of target amino acids that affect interactions of fungal polygalacturonases and their plant inhibitors. Physiol. Mol. Plant Pathol. 56:117–130.
Thomas, F. D., and R. P. Stephen. 1999. Molecular genetics of ecological diversification: duplication and rapid evolution of toxin genes of the venomous gastropod Conus. Proc. Natl. Acad. Sci. USA 96:6820–6823.
Thomas, F. D., and R. P. Stephen. 2000. Evolutionary diversification of multigene families: allelic selection of toxins in predatory cone snails. Mol. Biol. Evol. 17:1286–1293.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.
Torto, T. A., S. Li, A. Styer, E. Huitema, A. Testa, N. A. Gow, P. van West, and S. Kamoun. 2003. EST mining and functional expression assays identify extracellular effector proteins from the plant pathogen Phytophthora. Genome Res. 13:1675–1685.
Unkles, S. E., R. P. Moon, A. R. Hawkins, J. M. Duncan, and J. R. Kinghorn. 1991. Actin in the oomycetous fungus Phytophthora infestans is the product of several genes. Gene 100:105–112.
van't Slot, K. A. E., and W. Knogge. 2002. A dual role for microbial pathogen-derived effector proteins in plant disease and resistance. Crit. Rev. Plant Sci. 21:229–271.
Vleeshouwers, V. G. A. A., W. van Dooijeweert, F. Govers, S. Kamoun, and L. T. Colon. 2000. The hypersensitive response is associated with host and nonhost resistance to Phytophthora infestans. Planta 210:853–864.
Wang, G. L., D. L. Ruan, W. Y. Song et al. (12 co-authors). 1998. Xa21D encodes a receptor-like molecule with a leucine-rich repeat domain that determines race-specific recognition and is subject to adaptive evolution. Plant Cell 10:765–779.
Whisson, S. C., T. van der Lee, G. J. Bryan, R. Waugh, F. Govers, and P. R. J. Birch. 2001. Physical mapping across an avirulence locus of Phytophthora infestans using a high representative, large insert bacterial artificial chromosome library. Mol. Genet. Genomics 266:289–295.
Wu, C. I. 2001. The genic view of the process of speciation. J. Evol. Biol. 14:851–865.
Yang, Z., and J. P. Bielawski. 2000. Statistical methods for detecting molecular adaptation. Trends Ecol. Evol. 15:496–503.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555–556.
Yang, Z., R. Nielsen, N. Goldman, and A. M. Pedersen. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.
Yang, Z. 2002. Inference of selection from multiple species alignments. Curr. Opin. Genet. Dev. 12:688–694.
Zhang, J., Y. Zhang, and H. Rosenberg. 2002. Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf-eating monkey. Nat. Genet. 30:411–415.