Department of Genetics and Evolution, Max Planck Institute of Chemical Ecology, Jena, Germany
Abstract |
---|
Introduction |
---|
As a step toward this Arabidopsis-Arabis model system, this paper has three goals: (1) to examine the robustness of inferred phylogenetic relationships using data from two nuclear genes; (2) to estimate the divergence time between A. thaliana and its wild relatives; and (3) to estimate synonymous substitution rates of several genes, which might provide a molecular clock useful for future estimation of divergence times of species and alleles.
The Brassicaceae (Cruciferae, or mustard family) comprises approximately 340 genera and 3,350 species, including the economically important Brassica crops and the model organism A. thaliana. Recently, Koch, Bishop, and Mitchell-Olds (1999) analyzed sequence data from the nuclear ribosomal DNA internal transcribed spacer (ITS) and showed that published phylogenies based on Adh, Adc, and plastidic loci (such as ndhF) are largely congruent with a comprehensive ITS phylogeny of 33 taxa representing major lineages of Arabideae. The genera Arabidopsis and Arabis are both polyphyletic: some distantly related species appear within these taxonomic classifications, which should be revised (Galloway, Malmberg, and Price 1998; Al-Shehbaz, O'Kane, and Price 1999).
The oldest fossil evidence of Brassicaceae occurs in Oligocene deposits (22-34 MYA; Cronquist 1981). Yang et al. (1999) combined sequence data from NADH subunit 4 (nad4) with the estimated divergence time between maize and wheat to infer the age of the Brassica-Arabidopsis divergence at approximately 14-20 Myr. Since Brassica and Arabidopsis lineages separated fairly early in crucifer evolution (Koch, Bishop, and Mitchell-Olds 1999; Koch, Haubold, and Mitchell-Olds 2000), this provides congruent fossil and molecular evidence suggesting that early crucifer evolution occurred on the order of 30 MYA. Here, we seek independent estimates for species divergence times within the Brassicaceae, especially between A. thaliana and its close relatives.
Such analyses also estimate Ks, the synonymous nucleotide substitution rate. Plausible estimates for the rate of nucleotide substitution can be used to address many questions of evolutionary interest. For example, Ks estimates for Adh have been used to calculate times for speciation and domestication of maize (Hilton and Gaut 1998), the time for a surge of retrotransposition that doubled the size of the maize genome during the last several million years (SanMiguel et al. 1998), and the age of molecular polymorphisms at the A. thaliana Adh locus (Innan et al. 1996). On the other hand, synonymous substitution rates may be heterogeneous within and among loci and may not be neutral (Comeron, Kreitman, and Aguade 1999; Llopart and Aguade 1999).
In this study of crucifer evolution, we used two well-known nuclear genes encoding the enzymes chalcone synthase (CHS; EC 2.3.1.74) and alcohol dehydrogenase (ADH; EC 1.1.1.1) as molecular marker loci. Chalcone synthase participates in plant secondary metabolism by catalyzing the condensation of three molecules of malonyl-CoA and one molecule of p-coumaroyl-CoA to yield chalcone, a precursor in biosynthesis of flavonoids. In contrast, alcohol dehydrogenase is part of the primary metabolism and catalyzes the reduction of acetaldehyde to ethanol under anoxia. Both genes are members of multigene families in some plant taxa (Gaut and Clegg 1991; Clegg, Cummings, and Durbin 1997). However, previous analysis of Adh in Brassicaceae provided little evidence for gene duplication in crucifers, with the exception of the genus Leavenworthia (Charlesworth, Liu, and Zhang 1998). In the absence of sequence information from closely related taxa, it is not possible to decide whether the gene duplication in Leavenworthia occurred before or after the origin of this taxon. Leavenworthia belongs to subtribe Cardaminae. If the duplication is older than Leavenworthia, it should also be found in related genera from subtribe Cardaminae of tribe Arabideae, e.g., Cardamine, Rorippa, or Nasturtium.
Chs belongs to multigene families in plant taxa such as petunias (Koes, Spelt, and van den Elzen 1989), Ipomoea species (Durbin et al. 1995), and legumes (Ryder et al. 1987; Wingender et al. 1989; An et al. 1993; Junghans, Dalkin, and Dixon 1993; Howles, Arioli, and Weinman 1995). Different gene copies indicating multiple paralogs for Chs have been reported for Sinapis (Durbin et al. 1995). However, like its close relatives Brassica and Raphanus (Warwick and Black 1991), there is strong evidence that Sinapis consists of ancient polyploids (Sadowski et al. 1996; Cavell et al. 1998). In A. thaliana, which has a diploid genome without extensive duplication, Chs is thought to be single-copy (Cain et al. 1997). With only limited gene duplication, both Adh and Chs are candidate loci for construction of gene trees for Arabidopsis and its relatives.
Materials and Methods |
---|
For each gene, both strands were cycle-sequenced using the Taq DyeDeoxy Terminator Cycle Sequencing Kit (ABI Applied Biosystems). Products of the cycle-sequencing reactions were run on an ABI 377XL automated sequencer (ABI Applied Biosystems). Cloned PCR products were sequenced using universal T7 forward (5'-gtaacgatttaggtgacactatcg-3') and M13-48 reverse (5'-agcggataacaatttcacacagga-3') primers. Additional internal primers were designed for Adh (each in both orientations) (ADH-FOR2 [REV2] [5'-atcaagattctcttcacttc-3'], ADH-FOR3 [REV3] [5'-acatgtgtgatcttctcagg-3'], ADH-FOR4 [REV4] [5'-gttgtggtttatccactggttag-3'], ADH-FOR5 [REV5] [5'-aagaaaggtcaaagtgttgc-3'], and ADH-FOR6 [REV6] [5'-gccatgattcaagcatttgaatg-3']) and for Chs (each in both orientations) (CHS-FOR2 [REV2] [5'-gaccgacctcaaggagaag-3'], CHS-FOR3 [REV3] [5'-cgtggtggtcgaagtccctaagct-3'], and CHS-FOR4 [REV4] [5'-gactggaactccctcttctgga-3']). Approximate primer positions are shown in figure 1.
Isozyme Electrophoresis
Native acrylamide gel electrophoresis was performed by the method described in Koch, Huthmann, and Hurka (1998) using a lithium hydroxide-borate acid electrode buffer and a Tris citric acid gel buffer (Scandalios 1969). Electrophoresis was carried out for 8 h at 75 V. Gels were stained for alcohol dehydrogenase activity as described by Soltis et al. (1983).
We used the GCG software package (Wisconsin Package, version 9.1-unix 1997, Genetics Computer Group, Madison, Wis.) to estimate isoelectric points of Adh proteins from DNA sequence information and compared those values with the observed migration rates.
Data Analysis
Phylogenetic Analysis
Introns and promoter regions were removed manually, and the remaining coding sequences were aligned using CLUSTAL V (Higgins, Bleasby, and Fuchs 1992). The lengths of the resulting alignments were 1,143 bp and 1,188 bp for Adh and Chs, respectively. Phylogenetic distances were computed using Kimura's (1980) two-parameter model, and the resulting distance matrices were subjected to the neighbor-joining algorithm as implemented in PHYLIP (Felsenstein 1995). One thousand bootstrap samples were analyzed to assess confidence of nodes on the original neighbor-joining tree. Parsimony analysis was conducted with unordered Fitch parsimony. The analyses were run using PAUP 4.0* beta version (Swofford 1999) under HEURISTIC, TBR, and STEEPEST DESCENT with random addition of taxa. The bootstrap option of PAUP (1,000 replicates) and a decay analysis (Donoghue et al. 1992) were used to assess relative support in the unweighted analysis.
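The distance step described above can be illustrated with a minimal sketch of Kimura's two-parameter distance for a single pair of aligned coding sequences. The function name `k2p_distance` and the choice to skip columns containing gaps or ambiguous bases are assumptions made for this illustration; the analysis reported here relied on PHYLIP, not on this code.

```python
import math

PURINES = {"A", "G"}

def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: columns with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared      # proportion of transitions
    q = transversions / compared    # proportion of transversions
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Applying such a function to every pair of taxa would give the distance matrix that a neighbor-joining implementation takes as input.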
Trees were rooted using Aethionema grandiflora as an outgroup in the data sets. The Brassicaceae is a well-defined family (Schulz 1936), and previous studies indicate that the genus Aethionema is the sister taxon to the rest of the family (Zunk et al. 1996; Galloway, Malmberg, and Price 1998). For the overall Adh analysis, including several database sequences, we used Brassica oleracea as an outgroup because the A. grandiflora Adh sequence was assembled from three cloned genomic fragments and might not represent one single locus.
Adh sequence analysis was carried out in two different ways. First, only those taxa for which Chs sequence data were also available were analyzed. Second, we added all published Adh sequences from crucifers to estimate a comprehensive evolutionary tree (Miyashita et al. 1998 [A. thaliana ecotypes Ci-0 and Landsberg, A. lyrata ssp. kawasakiana, Arabidopsis suecica, Arabidopsis korshinskii, Arabidopsis griffithiana adh1-2, Arabis glabra adh1-2, Arabidopsis wallichii, Arabidopsis himalaica, Arabis hirsuta ASIA, Arabis stelleri]; Charlesworth, Liu, and Zhang 1998 [alleles from Leavenworthia stylosa as representatives for their Adh1, Adh2, and Adh3]; Miyashita, Innan, and Terauchi 1996 [Arabidopsis halleri ssp. gemmifera]).
Estimation of Substitution Rates
Overall substitution rates were computed using Kimura's (1980) two-parameter model. Rates of synonymous (Ks) and nonsynonymous (Ka) substitutions were calculated according to Li's method (1993) as implemented in the li93 program (Wolfe 1993).
In order to obtain the number of mutations per year, the divergence time between C. amara and Barbarea vulgaris was set to 6 × 10^6 years. Hence, the number of synonymous mutations per year was calculated as µs = Ks(B. vulgaris, C. amara)/(2 × 6 × 10^6), where Ks is the synonymous substitution rate calculated according to Li (1993). A 95% confidence interval for µs was computed using the following simulation approach: A random Ks value was drawn from a normal distribution with mean Ks and standard deviation SD(Ks[B. vulgaris, C. amara]), where the standard deviation of Ks was determined according to Li (1993). The random Ks value was divided by two times a random variable drawn from a normal distribution with mean 6 × 10^6 and standard deviation 10^6/1.96. The latter distribution models our conviction of being 95% sure that the divergence time for B. vulgaris and C. amara is between 5 and 7 Myr, with a mean of 6 Myr. This procedure was repeated 1,000 times, the resulting mutation rates were sorted, and the top and bottom 2.5% of the distribution were removed to obtain a confidence interval for µs. An analogous simulation procedure was used to calculate a confidence interval around the divergence time for a given pair of taxa.
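The simulation described above can be sketched as follows. This is a minimal illustration, not the program used for the analysis: the function names, the fixed random seed, and the example standard deviation of Ks are assumptions. The example Ks of 0.18 is simply the value implied by a rate of 1.5 × 10^-8 and a calibration of 6 × 10^6 years.

```python
import random

def synonymous_rate(ks, divergence_years):
    """Point estimate: substitutions per site per year, Ks / (2T)."""
    return ks / (2.0 * divergence_years)

def simulate_rate_interval(ks_mean, ks_sd, t_mean=6e6, t_halfwidth=1e6,
                           replicates=1000, seed=0):
    """Approximate 95% interval for the rate by resampling Ks and T."""
    rng = random.Random(seed)       # fixed seed is an illustrative choice
    t_sd = t_halfwidth / 1.96       # 95% of the mass within +/- t_halfwidth
    rates = sorted(
        rng.gauss(ks_mean, ks_sd) / (2.0 * rng.gauss(t_mean, t_sd))
        for _ in range(replicates)
    )
    cut = int(round(0.025 * replicates))   # drop top and bottom 2.5%
    return rates[cut], rates[replicates - cut - 1]

# Hypothetical inputs: Ks = 0.18 with an assumed standard deviation of 0.025.
low, high = simulate_rate_interval(0.18, 0.025)
```

With 1,000 replicates, the sorted list loses its 25 smallest and 25 largest values, and the remaining extremes form the interval.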
The significance of correlations between matrices of substitution rates or ratios of terminal branch lengths was estimated using the Mantel permutation test (Mantel 1967). The order of taxa in the input matrices was permuted 10,000 times, and a correlation coefficient was computed for each new comparison. Two-tailed statistical significance was calculated as the frequency of obtaining a correlation coefficient whose absolute value was greater than or equal to the absolute value of the original point estimate (Manly 1994).
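A Mantel permutation test of this kind can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions: both inputs are symmetric square matrices given as nested lists, only the upper triangle is used, the taxa of the second matrix are permuted, and the names `pearson` and `mantel_test` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(a, b, permutations=10000, seed=0):
    """Return (observed r, two-tailed permutation P) for matrices a and b."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [a[i][j] for i, j in pairs]
    observed = pearson(x, [b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # permute the order of taxa in matrix b
        y = [b[order[i]][order[j]] for i, j in pairs]
        # Small tolerance guards against floating-point ties.
        if abs(pearson(x, y)) >= abs(observed) - 1e-12:
            hits += 1
    return observed, hits / permutations
```

Permuting rows and columns together, instead of shuffling individual cells, preserves the dependence among distances that share a taxon.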
Test for Recombination
Recombination was investigated by the maximum chi-squared method of Maynard Smith (1992) as implemented by Ross (1997). In this approach, one examines the distribution of segregating sites between two putative recombinant haplotypes using a sliding point that partitions the alignment into two regions. The observed distribution of segregating sites is then compared with the distribution expected if the segregating sites were randomly distributed, in order to find the point of maximum discrepancy between random and observed distribution. This represents a putative breakpoint for homologous recombination (Maynard Smith 1992). Only polymorphisms at third codon positions were included in the analysis.
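The sliding-point idea can be sketched as below. This is a simplified illustration and not the Ross (1997) implementation: it assumes the input is a list with one entry per variable site (1 if the two compared sequences differ there, 0 if they agree), evaluates a 2 × 2 chi-squared statistic at each candidate breakpoint, and assesses the maximum by shuffling site order. The function names and the (hits + 1)/(permutations + 1) convention are assumptions.

```python
import random

def max_chi_squared(diffs):
    """Return (k, chi2): the breakpoint after site k with the largest statistic."""
    n, total = len(diffs), sum(diffs)
    best_k, best = None, 0.0
    left = 0
    for k in range(1, n):
        left += diffs[k - 1]
        a, b = left, k - left                      # left of k: differ / agree
        c, d = total - left, (n - k) - (total - left)  # right of k
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:
            continue
        chi2 = n * (a * d - b * c) ** 2 / denom
        if chi2 > best:
            best_k, best = k, chi2
    return best_k, best

def breakpoint_p_value(diffs, permutations=1000, seed=0):
    """Permutation P value for the observed maximum chi-squared."""
    rng = random.Random(seed)
    observed = max_chi_squared(diffs)[1]
    shuffled = list(diffs)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # randomize positions of the differing sites
        if max_chi_squared(shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)
```

A strongly clustered input, such as ten differing sites followed by thirty agreeing sites, places the maximum at the boundary between the two blocks.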
Results |
---|
Of the 28 accessions sequenced, only A. griffithiana yielded two Chs sequences (fig. 2). Because A. griffithiana plants were propagated over three generations via single-seed descent, the material we used should be homozygous. Moreover, the stock center from which we obtained seeds of A. griffithiana propagates these plants via selfing. The two distinct sequences differed at 2.00% of their nucleotides. This level of sequence divergence, combined with the fact that A. griffithiana is tetraploid, suggests that these Chs sequences may represent two duplicated loci.
Alcohol Dehydrogenase
For Adh, we were not able to design primers located in the 5' promoter region. Therefore, in contrast to Chs, orthology could not be inferred based on promoter similarity. Hence, we inferred whether sequences were likely to be orthologous or paralogous based on breeding system, history of self-pollination, ploidy level, and degree of divergence among sequences. When two sequences are very similar (< 1.0%), they are likely to represent alleles at a single locus, whereas higher levels of divergence (> a few percent) typically characterize diverged loci within a gene family. Between these extremes, additional experimentation is required to distinguish allelic polymorphism from locus duplication. Although we found a few examples of species having several moderately diverged sequences, in these instances, a clear distinction between alleles and loci is not central to the research goals of this paper. We refer to such observations as "genes" or "sequences," so as not to imply whether they actually represent alleles or loci.
Arabis alpina harbored the highest number of Adh sequences, with three putative alleles (0.76%) in the two African populations studied (fig. 3). Among the 10 individuals investigated for population AFRICA2, 3 individuals had the allele Adh1-1 (AF110426; table 1), and the remaining 7 individuals carried the allele Adh1-2 (AF110427). A third allele was detected among individuals from the European population. The two populations of A. drummondii and of A. petraea that carried different Chs genes also had different Adh sequences (2.46% and 0.97%, respectively).
Duplicated Adh loci have previously been described by Charlesworth, Liu, and Zhang (1998), who found three distinct loci with numerous alleles from different Leavenworthia species. Different Adh loci differed in the number of introns, which is similar to our findings in Arabis. They refer to these loci as Adh1 (all introns present), Adh2 (no intron 4), and Adh3 (missing all introns). In our full Adh analysis, we included one allele as a representative from each locus and found that Adh2 and Adh3 from Leavenworthia differ from Arabis Adh loci (fig. 4). Therefore, Adh2 and Adh3 of Leavenworthia and Arabis are not orthologous and presumably arose by independent duplication events.
All of the Adh sequences appeared to code for functional proteins, since none contained stop codons or insertions/deletions. In order to further investigate the relationship between nucleotide sequence and enzyme phenotype, we subjected taxa with more than one Adh locus to isozyme electrophoresis. In each case, we detected only a single band. Relative to A. thaliana (Rf = 100), we observed migration rates of Rf = 100 for A. hirsuta EUROPE, Rf = 95 for A. procurrens, Rf = 90 for A. jaquinii, and Rf = 103 for A. blepharophylla (table 2). These single bands of Adh activity could indicate that there was only one highly expressed Adh locus or that products from more than one locus comigrated on the gel. We distinguished between these two possibilities by computing the expected migration rates for these two proteins (table 2). The difference in net charge between A. hirsuta allele pairs Adh1/Adh2-1 and Adh1/Adh2-2 was 1 and 4, respectively (table 2). Since differences in net charge of 1 or more are easily detected with our experimental protocol, activities of enzymes encoded by Adh2-1/Adh2-2 in A. hirsuta must be low or absent. The same argument applies to Adh2 from A. procurrens, which is distinguished from Adh1 by a charge difference of 1.5. For A. blepharophylla, allozyme electrophoresis did not yield additional information about gene expression, as the products from all three loci had the same net charge.
The gene phylogeny based on Adh showed four major groups of taxa: (1) B. oleracea as the most basal group, followed by (2) Arabis pauciflora, followed by the two large crown clades containing (3) A. thaliana and (4) A. alpina (figs. 4 and 6). In contrast, the gene phylogeny based on Chs contained only two of these major clades, with Sinapis alba and A. pauciflora now constituting a clade with A. thaliana (figs. 2 and 5). Sinapis alba and Raphanus sativus are very closely related to B. oleracea (Warwick and Black 1991, 1997). Notice, however, that on both phylogenetic trees the bootstrap support and decay values for the basal groups were low, indicating that the trees were not highly resolved at this level of the analysis.
In addition to the disagreement about the branching order near the root, there was one difference between crown groups. According to Adh, A. deltoidea groups with A. alpina, away from A. jaquinii (fig. 6). In contrast, according to Chs, A. deltoidea groups with A. jaquinii, away from A. alpina (fig. 5). This incongruence is well supported by bootstrapping and decay analysis on both trees and may indicate lineage sorting, convergent evolution, or an ancient intergenic recombination event (Syvanen 1994).
Intergenic recombination might have occurred if A. deltoidea arose by hybridization between the ancestors of A. alpina and other European Arabis species. During constitution of A. deltoidea, intragenic recombination might have occurred between parental Chs copies, giving rise to a recombinant Chs sequence in A. deltoidea. Because only a minor part of the gene (exon 1 with 294 bp, approximately 25% of the entire coding region) resembles A. alpina, whereas the major part (exon 2) resembles progenitors of other European Arabis, A. deltoidea clusters in phylogenetic analysis basal to A. procurrens and its relatives. We investigated this further using the maximum chi-squared method (Maynard Smith 1992) to search for recombination points between sequences. No breakpoints were found in Adh (data not shown). However, when the Chs sequences for A. deltoidea and A. alpina were compared, a single potential insertion point was found between nucleotides 294 and 297 (P = 0.002). The same putative recombination point was identified by comparisons between either of the two A. alpina genes and members of the A. deltoidea clade (A. deltoidea, A. jaquinii, A. procurrens, A. blepharophylla, and A. hirsuta). Sliding-window analyses of the distribution of substitutions between pairs of Chs sequences from the A. alpina and A. deltoidea clades revealed a region of high diversity at the 3' end of the gene (data not shown).
Biogeography and Genetic Variation
For some populations, the distribution of Adh genes was correlated with geographic origin. For A. alpina, all four sequences grouped to a single clade. It has previously been speculated that African populations of A. alpina have been separated from European populations since the Tertiary (Plantholt 1995). The Adh1 genes of European Arabis species were distinct from those of Asian and North American taxa, as indicated by high bootstrap support (fig. 6). This included Adh1 from Asian A. hirsuta, which differed from the Adh1 gene found in the corresponding European population. This taxon might bridge the European-North American disjunction via Asia and is reported to be distributed in the United States with numerous morphologically divergent subspecies (Rollins 1941). No geographic structuring of genetic variation was found for Arabidopsis lyrata from the United States and Asia or A. petraea from Europe. These two taxa are thought to be conspecific (Koch, Bishop, and Mitchell-Olds 1999), and these Adh sequences were nested among one another (fig. 3). The intercontinental disjunction of A. blepharophylla in North American versus European Arabis was correlated with the presence of a new Adh3 locus, lacking introns 4-6, in A. blepharophylla. This intronless Adh gene was not observed in any European Arabis. Adh3 of A. blepharophylla is most closely related to Adh2 from the same species.
Relative Substitution Rates
The mean numbers of substitutions per base were 0.087 for Adh and 0.111 for Chs. If substitutions are a function of time only, the numbers of substitutions in Adh and Chs should be perfectly correlated. In accordance with this expectation, the correlation between substitution matrices for Adh and Chs was 0.87 (P < 10^-4). However, gene phylogenies for Adh and Chs also suggest possible heterogeneity of substitution rates (figs. 2 and 3). Therefore, we tested the null hypothesis of equal rates of evolution. To examine rate heterogeneity between two terminal branches, let
where ij is the ratio of the lengths of two terminal branches, dij is the distance between the ith and the jth taxa, and dio is the distance between the ith taxon and the outgroup (Gaut et al. 1996), in this case A. grandiflora. We used the method of Muse and Gaut (1994) to test the null hypothesis that the two terminal branches are of equal length, i.e., that ij = 1. Notice that ij is undefined if dij + dio = djo, i.e., as the terminal branch leading to the jth taxon becomes very short; notice also that ij is negative if dij < |dio - djo|, i.e., if one of the terminal branches has a negative length. Since negative branch lengths are difficult to interpret biologically, we excluded all comparisons that returned negative values of ij in any of the four ij matrices computed for synonymous and nonsynonymous substitutions at Adh and Chs (table 3).
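The ratio of terminal branch lengths can be illustrated with the standard three-taxon decomposition of pairwise distances. The sketch below is an assumption-laden illustration and not a reconstruction of the expression used in the analysis: the function names are hypothetical, and the choice of which branch goes in the numerator (here the branch leading to taxon i) is arbitrary.

```python
import math

def terminal_branches(d_ij, d_io, d_jo):
    """Lengths of the branches leading to taxa i and j in a three-taxon tree
    (taxa i, j, and outgroup o), derived from the pairwise distances."""
    branch_i = (d_ij + d_io - d_jo) / 2.0
    branch_j = (d_ij + d_jo - d_io) / 2.0
    return branch_i, branch_j

def branch_ratio(d_ij, d_io, d_jo):
    """Ratio of the two terminal branch lengths (branch i over branch j).

    Returns None when the denominator branch has zero length, in which
    case the ratio is undefined.
    """
    branch_i, branch_j = terminal_branches(d_ij, d_io, d_jo)
    if math.isclose(branch_j, 0.0, abs_tol=1e-12):
        return None
    return branch_i / branch_j
```

Under equal rates the two branches have the same length and the ratio is 1. The ratio becomes negative whenever d_ij is smaller than the absolute difference between the two outgroup distances, which corresponds to the excluded comparisons.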
Alcohol Dehydrogenase
At synonymous sites, the relative rates of evolution were quite homogeneous (table 3), with A. turrita causing all three significant deviations from homogeneity. At synonymous sites in Chs and Adh, only 3 of 156 comparisons of relative synonymous rates were significant. This is <5%; hence, there is no evidence for heterogeneity of synonymous rates. In general, A. turrita and A. glabra seemed to evolve more slowly than the other taxa (table 3). At Adh nonsynonymous sites, 10 of the 78 comparisons were significant (table 3). In particular, B. vulgaris had an elevated rate of Adh evolution, while that of A. procurrens was reduced in all comparisons (table 3).
Correlating Substitution Patterns at the Chs and Adh Loci
The ratios of branch lengths provided a starting point to further probe the evolutionary dynamics at Chs and Adh by testing for correlations between the ij-matrices. If the number of substitutions is a function of mutation rate, then synonymous and nonsynonymous ij-matrices for a given locus should be correlated. This was the case for Adh (r = 0.29, P = 0.03) but not for Chs (r = -0.13, P = 0.34), which may indicate that the substitution process has reached saturation at Chs but not at Adh. In order to test for genomewide evolutionary dynamics, the ij-matrices were compared between loci, but no significant correlations were found (not shown).
How Old Is the Arabis Clade?
Results of the relative-rate test allowed us to focus on a subset of taxa with homogeneous rates of evolution. To estimate divergence times among these taxa, we used information from Pliocene deposits of Rorippa pollen to calibrate the rates of synonymous substitution for Adh and Chs. Rorippa is a close relative of Cardamine and Barbarea (Franzke et al. 1998). Mai (1995) cited extensive Rorippa pollen deposits in geological samples from the Pliocene (2.5-5.0 MYA). It follows that Barbarea and Cardamine diverged before the Pliocene, about 6 MYA (node Z; figs. 2 and 3). Assuming that Barbarea and Cardamine diverged 6.0 MYA (node Z), we estimated synonymous substitution rates and their 95% confidence intervals as 1.0 × 10^-8 < 1.5 × 10^-8 < 2.0 × 10^-8 and 9.9 × 10^-9 < 1.5 × 10^-8 < 2.1 × 10^-8 mutations per site per year for Chs and Adh, respectively. Therefore, the last common ancestor of A. thaliana and North American Arabis is between 9.0 and 11.0 Myr old (node A; table 4). Furthermore, A. thaliana diverged from its closest congeners about 5.1-5.4 MYA (node B; table 4), the North American Arabis diverged about 0.8-2.2 MYA (node C; table 4), and segregating alleles of Adh in A. thaliana diverged 1.5 MYA based on data from the Landsberg and Columbia ecotypes (node D in fig. 4; table 4).
Discussion |
---|
Age and Substitution Rates
Estimation of substitution rates, species divergence times, and the ages of segregating polymorphisms requires a known time point from the fossil record. Pollen from close relatives of Cardamine and Barbarea is common in geological samples from the Pliocene (2.5-5.0 MYA; Mai 1995). Therefore, we assume that Cardamine and Barbarea diverged about 6.0 MYA. This provides estimated substitution rates for Chs and Adh, leading to inferred divergence time for the species in this study. Using independent fossil dating and a different gene, Yang et al. (1999) obtained very similar dates for the Brassica-Arabidopsis divergence (see below). These similar estimates from independent analyses suggest that our estimates for substitution rates and divergence times are reasonable.
Estimates of divergence times are problematic, as they rely on the assumption of a molecular clock. Even if this assumption is justified, computation of the variance in divergence times is not straightforward (Sanderson 1998). We tested the hypothesis of uniform evolutionary rates for each pair of taxa and used only those pairs in further analyses where the null hypothesis was not rejected (table 3). In order to assess the confidence attached to our time estimates, we considered the following three sources of error: (1) uncertainty in the divergence time of the pair of reference taxa, (2) uncertainty in the Ks value of the reference taxa (B. vulgaris and C. amara in this case), and (3) uncertainty in the Ks value of the target taxa. We used a simulation approach to model the influence of all three factors simultaneously on the distribution of the divergence times. This enabled us to provide confidence intervals with our time estimates (table 4).
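The way the three sources of error combine can be sketched as a small Monte Carlo procedure. This is an illustration under assumptions, not the program used here: all three quantities are treated as independent normal variables, the function name and seed are hypothetical, and the example Ks values and standard deviations are invented for demonstration.

```python
import random

def simulate_divergence_time(ks_target, sd_target, ks_ref, sd_ref,
                             t_ref=6e6, t_ref_halfwidth=1e6,
                             replicates=1000, seed=0):
    """Approximate 95% interval for the divergence time of a target pair."""
    rng = random.Random(seed)
    t_sd = t_ref_halfwidth / 1.96
    times = []
    for _ in range(replicates):
        # Sources (1) and (2): calibration time and reference Ks give a rate.
        rate = rng.gauss(ks_ref, sd_ref) / (2.0 * rng.gauss(t_ref, t_sd))
        # Source (3): target Ks is converted to time with that rate.
        times.append(rng.gauss(ks_target, sd_target) / (2.0 * rate))
    times.sort()
    cut = int(round(0.025 * replicates))   # trim 2.5% from each tail
    return times[cut], times[replicates - cut - 1]

# Hypothetical example: a target pair with half the reference Ks.
low, high = simulate_divergence_time(0.09, 0.02, 0.18, 0.025)
```

Because the rate appears in the denominator, uncertainty in the reference Ks and in the calibration time widens the interval asymmetrically around the point estimate.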
We estimated a synonymous substitution rate of 1.5 × 10^-8 at Chs. This is slightly higher than the rates of 8 × 10^-9 calculated by Durbin et al. (1995) for Ipomoea and 6 × 10^-9 to 9 × 10^-9 estimated by Wolfe, Sharp, and Li (1989) for maize versus barley. Our synonymous substitution rate for Adh (1.5 × 10^-8) is also higher than the values found by Morton, Gaut, and Clegg (1996) for palms (2.6 × 10^-9) and by Gaut et al. (1996) for grasses (7.0 × 10^-9). Using information from six nuclear genes, Wolfe, Sharp, and Li (1989) estimated the average synonymous substitution rate at 5 × 10^-9 to 7 × 10^-9 substitutions per site per year among members of the grass family. Despite the great uncertainty associated with dating speciation events in the fossil record, the similar dating obtained by Yang et al. (1999) and that from our analysis suggest that synonymous substitution rates for Adh and Chs in the Brassicaceae exceed estimates obtained from studies in monocots.
From the substitution rates obtained in our study, we attempted to date several important nodes on the gene trees for Chs and Adh (table 4). We estimate that the last common ancestor of A. thaliana and its nearest relatives occurred approximately 5 MYA, while the most recent common ancestor of polymorphic A. thaliana Adh alleles occurred roughly 1.5 MYA. This contrasts with the finding of Innan et al. (1996), who calculated the age of the last common ancestor of the A. thaliana Adh alleles at 6.3 Myr based on an assumed substitution rate of 10^-9. This assumed rate is likely to be an underestimate, because Innan et al. (1996) included intron sequences in their analysis, whereas we calculated a synonymous substitution rate of 1.5 × 10^-8 from our Adh data.
We estimate that the crucifer lineages analyzed in this study (excluding A. grandiflora) originated about 24 MYA (table 4; node Y in figs. 2 and 3). This result corresponds to findings obtained from divergence time estimates of the mitochondrial gene for NADH subunit 4 (nad4) among cruciferous plants by Yang et al. (1999), who calculated the average distance between the Brassica clade and Arabidopsis to be 14-20 Myr. This value broadly overlaps with our estimates for a Brassica-Arabidopsis split (node Y in figs. 2 and 3; notice that these nodes connect different taxa in the two trees). However, the basal branching pattern is not well supported in either tree, and branches with low bootstrap support should be collapsed. Further rough extrapolation leads to the estimation that the most recent common ancestor of A. grandiflora and the rest of the Brassicaceae is approximately 30-60 Myr old (node X in figs. 2 and 3), which agrees with the oldest findings of Brassicaceae in the Oligocene (22-34 MYA; Cronquist 1981).
Allelic Variation and Gene Duplication
Patterns of diversity differed strongly at Chs and Adh. In diploid taxa, Chs was essentially a single-copy gene, with polyploid A. griffithiana providing the one exception to this rule (fig. 2). In contrast, many taxa contained more than one Adh sequence, which differed not only in the level of nucleotide diversity, but also in intron number. This contrasts with the situation in A. thaliana, which only has one Adh locus, but agrees with the observation that many plant species have several Adh loci (e.g., Trick et al. 1988; Yokoyama and Harry 1993). The sequence diversity observed at Adh might indicate that (1) polyploidization of taxa such as A. hirsuta took place after speciation of European Arabis, and (2) locus duplication leading to Adh2 in European Arabis occurred prior to or during the evolution of this group, because Adh1 and Adh2 are similar among the taxa studied (fig. 4).
With regard to intron number, Adh2 lacked all introns in A. procurrens and European A. hirsuta. The first observation of a plant Adh locus devoid of all introns was recently made in Leavenworthia, which contains three Adh loci that differ in number of introns; Adh3 has no introns and Adh2 has lost intron 4 (Charlesworth, Liu, and Zhang 1998). Intron-free copies of genes that usually contain introns have probably arisen through reverse transcription involving an mRNA intermediate (Charlesworth, Liu, and Zhang 1998). Since the closest relatives of Leavenworthia, C. amara and B. vulgaris, only have a single copy of Adh, the extra gene copies in Leavenworthia probably arose after the origin of this genus. All five species of Leavenworthia studied by Charlesworth, Liu, and Zhang (1998) have three Adh loci, whereas we found three loci solely in A. blepharophylla.
Charlesworth, Liu, and Zhang (1998) established by acetate gel electrophoresis that the intron-free copy of Adh in Leavenworthia codes for functional protein. We investigated the expression of Adh through polyacrylamide gel electrophoresis and found that only Adh1 has detectable enzyme activity in A. hirsuta and A. procurrens. If the intron-free Adh2 were a pseudogene, it would presumably accumulate deleterious mutations quickly, including internal stop codons. The fact that this has not occurred suggests that Adh2 in A. hirsuta and A. procurrens may be expressed, albeit at levels below the sensitivity of our electrophoresis setup or in tissues or environmental conditions that were not assayed here.
Comparison Between Genes
We analyzed rates of evolution at more than one locus because comparison between rates can give insights into the evolutionary process. For example, the generation time hypothesis posits that species with short generations will have higher substitution rates than species with longer generations. This effect should influence multiple loci. Selection, on the other hand, often acts on individual loci and may cause heterogeneity of evolutionary rates among loci. Within taxa, we found no significant correlations between loci for either synonymous or nonsynonymous substitutions, indicating that species-specific evolutionary factors do not influence the evolution of these two genes.
The inferred phylogenetic position of A. deltoidea differed according to whether Chs or Adh was used for the input data (figs. 2 and 3). In principle, incongruencies between gene trees may be due to lineage sorting, recombination, or convergent evolution. Sorting of ancient polymorphism into different lineages is an unlikely explanation because of the nonrandom distribution of polymorphisms within the Chs sequences. Recombination and convergent evolution are distinguished by positing neutrality in the case of recombination and selection in the case of convergent evolution. In order to minimize the effect of selection on our analysis, we considered only the distribution of substitutions in third codon positions in the maximum chi-squared test (Maynard Smith 1992). We still found significant deviation from randomness in the distribution of the substitutions, which was best explained by recombination or gene conversion.
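The idea behind the maximum chi-squared test can be sketched as follows: for two aligned sequences, every candidate breakpoint splits the sites into a left and a right block, a 2 x 2 chi-squared statistic compares the proportion of differing sites in the two blocks, and the largest statistic is evaluated by shuffling site order. This is an illustrative Python sketch, not the program of Ross (1997) used in this study. The function names are hypothetical, and for simplicity the helper scores every third codon position, whereas the published method is usually applied to variable sites only.

```python
import random

def third_position_flags(seq_a, seq_b):
    """1 where two aligned, in-frame coding sequences differ at a third codon position."""
    length = min(len(seq_a), len(seq_b))
    return [int(seq_a[i] != seq_b[i]) for i in range(2, length, 3)]

def chi2_2x2(a, b, c, d):
    """Chi-squared statistic for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def max_chi2(flags):
    """Return (largest chi-squared, number of sites left of that breakpoint)."""
    total, n = sum(flags), len(flags)
    best, best_k, left = 0.0, None, 0
    for k in range(1, n):
        left += flags[k - 1]
        # Rows: left and right block; columns: differing and identical sites.
        value = chi2_2x2(left, k - left, total - left, (n - k) - (total - left))
        if value > best:
            best, best_k = value, k
    return best, best_k

def permutation_p(flags, n_perm=1000, seed=0):
    """Share of site-order shuffles whose maximum is at least the observed one."""
    rng = random.Random(seed)
    observed = max_chi2(flags)[0]
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_chi2(shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical flags: differences clustered in the first half of the gene.
flags = [1] * 10 + [0] * 10
print(max_chi2(flags), permutation_p(flags))
```

A small permutation p-value indicates that differences are clustered along the sequence, the pattern expected when a recombination or gene conversion event has joined segments with different histories.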
Acknowledgements |
---|
Footnotes |
---|
1 Present address: Institute of Botany, Department of Systematic Botany and Geobotany, University of Agricultural Science, Vienna, Austria.
1 Keywords: Adh, Arabidopsis, Arabis, Chs, divergence time, phylogeny.
2 Address for correspondence and reprints: Thomas Mitchell-Olds, Department of Genetics and Evolution, Max Planck Institute of Chemical Ecology, Carl-Zeiss-Promenade 10, 07745 Jena, Germany. E-mail: tmo{at}ice.mpg.de
Literature Cited |
---|
Al-Shehbaz, I. A., S. L. O'Kane, and R. A. Price. 1999. Generic placement of species excluded from Arabidopsis (Brassicaceae). Novon 9:296-307.
An, C., Y. Ichinose, T. Yamada, Y. Tanaka, T. Shiraishi, and H. Oku. 1993. Structure and organization of the genes encoding chalcone synthase in Pisum sativum. Plant Mol. Biol. 21:789-803.
Bevan, M., and G. Murphy. 1999. The small, the large and the wild: the value of comparison in plant genomics. Trends Genet. 15:211-214.
Cain, C. C., D. E. Saslowsky, R. A. Walker, and B. W. Shirley. 1997. Expression of chalcone synthase and chalcone isomerase proteins in Arabidopsis seedlings. Plant Mol. Biol. 35:377-381.
Cavell, A. C., D. J. Lydiate, I. A. P. Parkin, C. Dean, and M. Trick. 1998. Collinearity between a 30-centimorgan segment of Arabidopsis thaliana chromosome 4 and duplicated regions within the Brassica napus genome. Genome 41:62-69.
Charlesworth, D., F. L. Liu, and L. Zhang. 1998. The evolution of the alcohol dehydrogenase gene family by loss of introns in plants of the genus Leavenworthia (Brassicaceae). Mol. Biol. Evol. 15:552-559.
Clegg, M. T., M. P. Cummings, and M. L. Durbin. 1997. The evolution of plant nuclear genes. Proc. Natl. Acad. Sci. USA 94:7791-7798.
Comeron, J. M., M. Kreitman, and M. Aguade. 1999. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics 151:239-249.
Cronquist, A. 1981. An integrated system of classification of flowering plants. Columbia University Press, New York.
Donoghue, M. J., R. G. Olmstead, J. F. Smith, and J. D. Palmer. 1992. Phylogenetic relationships of Dipsacales based on rbcL sequences. Ann. Mo. Bot. Gard. 79:333-345.
Durbin, M. L., G. H. Learn, G. A. Huttley, and M. T. Clegg. 1995. Evolution of the chalcone synthase gene family in the genus Ipomoea. Proc. Natl. Acad. Sci. USA 92:3338-3342.
Feldbrügge, M., M. Sprenger, K. Hahlbrock, and B. Weisshaar. 1997. PcMYB1, a novel plant protein containing a DNA-binding domain with one MYB repeat, interacts in vivo with a light-regulatory promoter unit. Plant J. 11:1079-1093.
Felsenstein, J. 1995. PHYLIP (phylogeny inference package). Version 3.57c. Distributed by the author, Department of Genetics, University of Washington, Seattle.
Franzke, A., K. Pollmann, W. Bleeker, R. Kohrt, and H. Hurka. 1998. Molecular systematics of Cardamine and allied genera (Brassicaceae): ITS and noncoding chloroplast DNA. Folia Geobot. 33:225-240.
Galloway, G. L., R. L. Malmberg, and R. A. Price. 1998. Phylogenetic utility of the nuclear gene arginine decarboxylase: an example from Brassicaceae. Mol. Biol. Evol. 15:1312-1320.
Gaut, B. S., and M. T. Clegg. 1991. Molecular evolution of alcohol dehydrogenase 1 in members of the grass family. Proc. Natl. Acad. Sci. USA 88:2060-2064.
Gaut, B. S., B. R. Morton, B. C. McCaig, and M. T. Clegg. 1996. Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proc. Natl. Acad. Sci. USA 93:10274-10279.
Higgins, D. G., A. J. Bleasby, and R. Fuchs. 1992. CLUSTAL V: improved software for multiple sequence alignment. Comput. Appl. Biosci. 8:189-191.
Hilton, H., and B. S. Gaut. 1998. Speciation and domestication in maize and its wild relatives: evidence from the Globulin-1 gene. Genetics 150:863-872.
Howles, P. A., T. Arioli, and J. J. Weinman. 1995. Nucleotide sequence of additional members of the gene family encoding chalcone synthase in Trifolium subterraneum. Plant Physiol. 107:1035-1036.
Innan, H., F. Tajima, R. Terauchi, and T. Miyashita. 1996. Intragenic recombination in the Adh locus of the wild plant Arabidopsis thaliana. Genetics 143:1761-1770.
Junghans, H., K. Dalkin, and R. A. Dixon. 1993. Stress response in alfalfa (Medicago sativa L.) 15. Characterization and expression patterns of members of a subset of the chalcone synthase multigene family. Plant Mol. Biol. 22:239-253.
Karkkainen, K., H. Kuittinen, R. van Treuren, C. Vogl, S. Oikarinen, and O. Savolainen. 1999. Genetic basis of inbreeding depression in Arabis petraea. Evolution 53:1354-1365.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Koch, M., J. Bishop, and T. Mitchell-Olds. 1999. Molecular systematics and evolution of Arabidopsis and Arabis. Plant Biol. 1:529-537.
Koch, M., B. Haubold, and T. Mitchell-Olds. 2000. Molecular systematics of the Brassicaceae: evidence from coding plastidic matK and nuclear CHS sequences. Am. J. Bot. (in press).
Koch, M., M. Huthmann, and H. Hurka. 1998. Isozymes, speciation and evolution in the polyploid Cochlearia L. (Brassicaceae). Bot. Acta 111:411-425.
Koes, R. R., C. E. Spelt, and P. J. M. van den Elzen. 1989. Cloning and molecular characterization of the chalcone synthase multigene family of Petunia hybrida. Gene 81:245-257.
Li, W.-H. 1993. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96-99.
Llopart, A., and M. Aguade. 1999. Synonymous rates at the RpII215 gene of Drosophila: variation among species and across the coding region. Genetics 152:269-280.
Mai, D. H. 1995. Tertiäre Vegetationsgeschichte Europas. Fischer, Jena, Stuttgart, New York.
Manly, B. F. J. 1994. Multivariate statistical methods: a primer. 2nd edition. Chapman and Hall, London.
Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Res. 27:209-220.
Manton, J. 1937. The problem of Biscutella laevigata L. II. Evidence from meiosis. Ann. Bot. 1:439-462.
Maynard Smith, J. 1992. Analysing the mosaic structure of genes. J. Mol. Evol. 34:126-129.
Miyashita, N. T., H. Innan, and R. Terauchi. 1996. Intra- and interspecific variation of the alcohol dehydrogenase locus region in wild plants Arabis gemmifera and Arabidopsis thaliana. Mol. Biol. Evol. 13:433-436.
Miyashita, N. T., A. Kawabe, H. Innan, and R. Terauchi. 1998. Intra- and interspecific DNA variation and codon bias of the alcohol dehydrogenase (Adh) locus in Arabis and Arabidopsis species. Mol. Biol. Evol. 15:1420-1429.
Morton, B. R., B. S. Gaut, and M. T. Clegg. 1996. Evolution of alcohol dehydrogenase genes in the palm and grass families. Proc. Natl. Acad. Sci. USA 93:11735-11739.
Mummenhoff, K., and M. Koch. 1994. Chloroplast DNA restriction site variation and phylogenetic relationships in the genus Thlaspi sensu lato (Brassicaceae). Syst. Bot. 19:73-88.
Muse, S. V., and B. S. Gaut. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol. Biol. Evol. 11:715-724.
Plantholt, U. 1995. Molekulare Untersuchungen zur Arealgeschichte von Arabis alpina L. (Brassicaceae). Ph.D. thesis, University of Osnabrück, Germany.
Price, R. A., J. D. Palmer, and I. A. Al-Shehbaz. 1994. Systematic relationships of Arabidopsis: a molecular and morphological perspective. Pp. 7-19 in E. M. Meyerowitz and C. R. Somerville, eds. Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Rollins, R. C. 1941. A monographic study of Arabis in western North America. Rhodora 43:289-325.
Ross, N. 1997. Maximum chi-squared. Distributed by Brian Spratt, Department of Zoology, Oxford University, Oxford, England.
Ryder, T. B., S. A. Hedrick, J. N. Bell, X. Liang, S. D. Clouse, and C. J. Lamb. 1987. Organization and differential activation of a gene family encoding the plant defense enzyme chalcone synthase in Phaseolus vulgaris. Mol. Gen. Genet. 210:219-233.
Sadowski, J., P. Gaubier, M. Delseny, and C. F. Quiros. 1996. Genetic and physical mapping in Brassica diploid species of a gene cluster defined in Arabidopsis thaliana. Mol. Gen. Genet. 251:298-306.
Sanderson, M. J. 1998. Estimating rate and time in molecular phylogenies: beyond the molecular clock? Pp. 242-264 in P. S. Soltis, D. E. Soltis, and J. J. Doyle, eds. Molecular systematics of plants. 2nd edition. Chapman and Hall, New York.
SanMiguel, P., B. S. Gaut, A. Tikhonov, Y. Nakajima, and J. L. Bennetzen. 1998. The paleontology of intergene retrotransposons of maize. Nat. Genet. 20:43-45.
Scandalios, J. G. 1969. Genetic control of multiple molecular forms of enzymes in plants: a review. Biochem. Genet. 3:37-79.
Schulz, O. E. 1936. Cruciferae. Pp. 227-658 in A. Engler and K. Prantl, eds. Die natürlichen Pflanzenfamilien. Vol. 17b. Engelmann, Leipzig, Germany.
Soltis, D. E., C. H. Haufler, D. C. Darrow, and G. J. Gastony. 1983. Starch gel electrophoresis of ferns: a compilation of grinding buffers, gel and electrode buffers, and staining schedules. Am. Fern J. 73:9-27.
Somerville, C., and S. Somerville. 1999. Plant functional genomics. Science 285:380-383.
Swofford, D. L. 1999. PAUP: phylogenetic analysis using parsimony. Version 4.0b2. Sinauer, Sunderland, Mass.
Syvanen, M. 1994. Horizontal gene transfer: evidence and possible consequences. Annu. Rev. Genet. 28:237-261.
Titz, W. 1970. Zur Cytotaxonomie von Arabis hirsuta agg. (Cruciferae). V. Artifizielle und natürliche F1-Hybriden sowie deren Cytogenetik. Öster. Bot. Z. 118:353-390.
Titz, W. 1976. Cytosystematic study of the Iberian taxa of the Arabis hirsuta group. Feddes Repertorium 87:493-502.
Titz, W. 1978. Experimentelle Systematik und Genetik der kahlen Sippen in der Arabis hirsuta-Gruppe (Brassicaceae). Bot. Jahrb. Syst. 100:110-139.
Trick, M., E. S. Dennis, K. J. R. Edwards, and W. J. Peacock. 1988. Molecular analysis of the alcohol dehydrogenase gene family of barley. Plant Mol. Biol. 11:147-160.
van Treuren, R., H. Kuittinen, K. Karkkainen, E. Baena-Gonzalez, and O. Savolainen. 1997. Evolution of microsatellites in Arabis petraea and Arabis lyrata, outcrossing relatives of Arabidopsis thaliana. Mol. Biol. Evol. 14:220-229.
Warwick, S. I., and L. D. Black. 1991. Molecular systematics of Brassica and allied genera (subtribe Brassicinae, Brassiceae): chloroplast genome and cytodeme congruence. Theor. Appl. Genet. 82:81-92.
Warwick, S. I., and L. D. Black. 1997. Phylogenetic implications of chloroplast DNA restriction site variation in subtribe Raphaninae and Cakilinae (Brassicaceae, tribe Brassiceae). Can. J. Bot. 75:960-973.
Wingender, R., H. Röhrig, C. Höricke, D. Wing, and J. Schell. 1989. Differential regulation of soybean chalcone synthase genes in plant defense, symbiosis and upon environmental stimuli. Mol. Gen. Genet. 218:315-322.
Wolfe, K. H. 1993. Software program li93. University of Dublin, ftp://acer.gen.tcd.ie/pub/khwolfe/li93
Wolfe, K. H., P. M. Sharp, and W.-H. Li. 1989. Rates of synonymous substitution in plant nuclear genes. J. Mol. Evol. 29:208-211.
Yang, Y.-W., K. N. Lai, P.-Y. Tai, and W.-H. Li. 1999. Rates of nucleotide substitution in angiosperm mitochondrial DNA sequences and dates of divergence between Brassica and other angiosperm lineages. J. Mol. Evol. 48:597-604.
Yokoyama, S., and D. E. Harry. 1993. Molecular phylogeny and evolutionary rates of alcohol dehydrogenases in vertebrates and plants. Mol. Biol. Evol. 10:1215-1226.
Zunk, K., K. Mummenhoff, M. Koch, and H. Hurka. 1996. Phylogenetic relationships of Thlaspi s. l. (subtribe Thlaspidinae, Lepidieae) and allied genera based on chloroplast DNA restriction-site variation. Theor. Appl. Genet. 92:375-381.