* Department of Botany, Iowa State University
† Pacific Northwest Research Station, USDA Forest Service
‡ Institute of Genetics and Cytology, Northeast Normal University, Changchun, China
§ Plant Genome Mapping Laboratory, University of Georgia, Riverbend Research Center
|| Department of EPO Biology, University of Colorado
¶ Department of Plant Sciences, University of Arizona
# Department of Agronomy and Range Science, University of California-Davis
Abstract
Key Words: Gossypium, cotton, polyploidy, molecular clock, substitution rates, evolution
Introduction
Gossypium L. contains 50 species whose phylogenetic relationships have been explored using multiple molecular data sets (Seelanan, Schnabel, and Wendel 1997; Small et al. 1998; Cronn et al. 2002b). Data indicate that shortly after its origin, Gossypium experienced rapid divergence (Cronn et al. 2002b), leading to modern monophyletic lineages that vary in chromosome size and interfertility (so-called "genome groups" A through G and K). There are five natural polyploids in the genus, which apparently spawned from a single polyploidization event 1 to 2 MYA (Cronn et al. 1996; Small et al. 1998; Wendel and Cronn 2003). All are "AD" genome tetraploids, combining an A-genome donated by the maternal diploid parent at the time of polyploid formation and a D-genome from the pollen parent (Galau and Wilkins 1989; Wendel 1989; Wendel and Cronn 2003). Among extant species, G. herbaceum L. and G. arboreum L. are the closest relatives of the A-genome progenitor, and G. raimondii Ulbrich is the best model of the D-genome progenitor (reviewed in Wendel and Cronn 2003). These two genome groups diverged from each other early in the evolution of the genus, perhaps 7 to 11 MYA (Seelanan, Schnabel, and Wendel 1997; Cronn et al. 2002b). Notably, the A-genome is about twice the size of the D-genome (2C = 3.8 pg versus 2.0 pg), and these size differences are perpetuated in the natural polyploids, which exhibit an additive genome size (Wendel et al. 2002 and references therein). Economically important polyploid species include G. barbadense L. (Sea Island and Pima cotton) and G. hirsutum L. (upland cotton).
Because of the absence of a fossil record, divergence times in Gossypium have been estimated primarily using molecular clock assumptions. These estimates have been based either on thermal denaturation-renaturation studies or on a small number of genes (Endrizzi, Katterman, and Geever 1989; Wendel 1989; Seelanan, Schnabel, and Wendel 1997; Cronn, Small, and Wendel 1999; but see Cronn et al. 2002b). Our objective here was to examine the extent of nuclear gene rate variation among genes and lineages in Gossypium and to use the resulting data set to generate a clearer understanding of the temporal components of the evolutionary history of Gossypium. We also wished to explore patterns of gene evolution in polyploid cotton, using as a comparative framework orthologs from the diploid progenitors. Previous studies (Cronn, Small, and Wendel 1999; Small and Wendel 2000) suggested that rates of sequence evolution may be enhanced in allopolyploid Gossypium relative to its diploid progenitors, although this pattern was difficult to statistically verify due to the relative recency of polyploid formation. Perhaps a more extensive sampling of genes would provide additional power to test the intriguing hypothesis that polyploidization leads to accelerated molecular evolutionary rates.
Materials and Methods
Plant Taxa
Species were chosen for this study based on the organismal framework provided in figure 1. Gossypium raimondii (unnamed accession) was chosen because it is the best living model of the D-genome donor (Wendel and Cronn 2003). The two A-genome diploids (G. arboreum and G. herbaceum) are phylogenetically equidistant from the A-genome progenitor of polyploid cotton and are thus interchangeable for the present purposes; in most cases G. herbaceum (GenBank accession numbers A173) was used, but G. arboreum (GenBank accession numbers A247) was substituted when amplification difficulties were encountered. To represent allopolyploid cotton, either G. barbadense Pima S6 or G. hirsutum TM1 was used, depending on the bacterial artificial chromosome (BAC) library used for sequence determination. To root phylogenetic trees and to provide reference sequences for relative rate tests, we included Gossypioides kirkii (Mast.) J. B. Hutchinson or Kokia kauiensis (Rock) O. Deg. and Duvel, representatives of sister genera that have been shown by phylogenetic analyses (Seelanan, Schnabel, and Wendel 1997; Wendel et al. 2002) to comprise the closest living relatives of Gossypium L. DNAs were isolated from young leaves using previously described protocols (Cedroni et al. 2002; Cronn, Small, and Wendel 1999; Cronn et al. 2002b; Wendel et al. 2002).
For each gene studied, allopolyploid species contain two homoeologous sequences, representing descendants of those contributed by the A-genome and D-genome donors at the time of polyploid formation. To isolate both homoeologs, we used one of two approaches. In one, heterogeneous PCR products were cloned after amplification from genomic DNA and the two duplicates were identified by restriction site analysis. Alternatively, homoeologs were isolated individually by PCR from BAC clones derived from either G. hirsutum cv. Maxxa (Tomkins et al. 2001) or G. barbadense cv. Pima S6 (A. Paterson, unpublished data). Since each BAC clone contained only one of the two homoeologs, this latter strategy proved effective in minimizing problems of in vitro PCR recombination (Cronn et al. 2002a). BAC DNA was isolated from 50 ml cultures using the Psi-Clone Big BAC DNA Extraction Kit (Princeton Separations, Inc.).
Automated sequencing was conducted using the ABI Big Dye v. 2.0 fluorescent primers and ABI Prism 377-3700 system at the Iowa State DNA Sequencing and Synthesis Facility. GenBank numbers for all sequences, aligned lengths for each gene, and putative protein functions are listed in table 1.
Data Analysis
Sequences were aligned using BioEdit (Hall 1999) v. 5.0.9 (http://www.mbio.ncsu.edu/BioEdit/bioedit.html), and the resulting alignments were adjusted manually. DnaSP v. 3.53 (Rozas and Rozas 1999) was used to estimate G+C content and substitutions per site for synonymous (Ks), silent (Ksil, including both synonymous and noncoding sites), and replacement (Ka) sites. Phylogenetic analyses and Kimura two-parameter estimates of genetic distance were obtained using PAUP* (Swofford 1998). To evaluate possible rate heterogeneity among Gossypium lineages, relative rate tests were performed using the 1D tests of Tajima (1993). Analysis of variance was performed to determine whether lineage-specific estimates of divergence showed significant associations with genomes (A or D; fixed effect), ploidy levels (2X or 4X; fixed effect), or loci (random effect). For these analyses, evolutionarily inferred apomorphies for terminal lineages (A and D diploid, At and Dt tetraploid) were transformed into the number of inferred substitutions per kb of sequence. Subsequent generalized linear model analyses (PROC GLM in SAS v. 8.0; SAS Institute, Cary, N. C.) utilized untransformed values. Data sets that lacked outgroups (A1834 and AdhE), included known or suspected pseudogenes (AdhC), or exhibited questionable orthology (Myb4 and E11) were omitted from this analysis. For this reason, the number of genes included in analysis of variance was 43, rather than the 48 used in other computations.
Results
On average, G+C content was higher in coding (0.452 ± 0.045) than in noncoding regions (0.329 ± 0.039). G+C values for second (0.415 ± 0.078) and third (0.416 ± 0.078) codon positions were similar.
Gene Tree Topologies
Because diploid cottons diverged from each other recently relative to the scale of molecular evolution for single-copy nuclear genes, homoplasy was expected to be low. This was indeed the case for all 48 genes, with consistency indices ranging from 0.92 to 1.00, with a mean of 0.99 across the 48 gene trees. Accordingly, parsimony analysis yielded unambiguously resolved topologies that are congruent with the well-documented (Cronn et al. 2002b; Wendel and Cronn 2003) phylogeny of Gossypium shown in figure 1. One test provided by these 48 gene trees is that of gene conversion or recombination between homoeologs, which would be expected to lead either to elevated homoplasy or an altered phylogenetic topology. As shown previously for a smaller set of genes (Cronn, Small, and Wendel 1999), there was no evidence in the 48 genes studied here for these forms of interlocus interaction.
A composite phylogenetic tree with summed branch lengths representing the total number of inferred substitutions (silent and replacement) is shown in figure 2. Summed across branches, 816 substitutions are inferred to have occurred in the gene regions sampled since divergence of the A and D diploids from their common ancestor, with a slight acceleration in the polyploid genomes (total = 867). Similarly, in both the A-genome and the D-genome branches, more substitutions were observed in the polyploid genome than in orthologs from the corresponding diploid (123 versus 99 for the A-genome; 181 versus 154 for the D-genome). As expected from previous data (reviewed in Wendel and Cronn 2003), G. herbaceum and G. arboreum are closer models of the A-genome ancestor than G. raimondii is of the D-genome donor; viz., branch lengths are shorter in the A-genome (total of 222 substitutions) than in the D-genome (335 substitutions) clade (fig. 2).
To explore in more detail the nature of substitutions contributing to overall divergence in Gossypium, we tabulated levels of synonymous (Ks), replacement (Ka), and silent (Ksil) substitutions and Ka:Ksil ratios (table 3). As expected from the organismal phylogeny, comparisons between Gossypium and the phylogenetic outgroup yielded the highest Ks and Ka values in all cases (see also table 2). Average Ks and Ka values are higher in the A-D and At-Dt comparisons than in the A-At and D-Dt comparisons, as we would expect because the A-D divergence occurred well before the formation of the polyploids. Also, sequence divergence was lower in A-At comparisons than in D-Dt, as noted above and as evidenced by the summary in figure 2. Sequence divergence amounts between the parental species (A and D) and their respective genomes in the polyploid (At and Dt) are not evidently different, although this is not easily statistically tested since so much of their evolutionary history is shared (the divergence estimates are not independent). Replacement substitutions are lower in all pairings compared with corresponding Ks and Ksil estimates.
For each gene, estimates of K2P were calculated between the phylogenetic outgroup and each Gossypium genome. Divergences between each ingroup sequence and that of the outgroup were similar in all cases, as expected under conditions of rate homogeneity and the phylogeny provided in figure 1. This comparison provides an informal relative rate test, suggesting that there is no rate variation among the Gossypium taxa studied (as indicated also by formal relative rate tests; see below).
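The Kimura two-parameter (K2P) distance used in these comparisons can be illustrated with a short sketch. This is not the PAUP* implementation used in the study; it is a minimal version of the standard K2P formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name `k2p_distance` is hypothetical, and the sketch assumes two pre-aligned sequences of equal length, skipping any column that contains a gap or an ambiguous base.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; columns with gaps or
    ambiguity codes are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites    # proportion of transitional differences
    q = transversions / sites  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Applying such a function to each ingroup sequence paired with the same outgroup gives the set of distances compared in the informal relative rate check described above.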
Given the range in molecular evolutionary rates among genes studied, it was of interest to explore the distribution of rates among genes and estimate a mean rate for both diploid and allopolyploid cotton. K2P values for all 48 genes visually appeared to be approximately normally distributed for both the A-D and the At-Dt comparisons (not shown), but normality was marginally rejected at the 0.05 level by Shapiro-Wilk tests. Mean divergences for diploid versus polyploid sequences were similar (0.0223 ± 0.0015 versus 0.0236 ± 0.0014), as were standard deviations around these means (0.011 versus 0.010).
Rate Variation Among Gossypium Lineages
To explore whether rate heterogeneity existed among Gossypium lineages for any of the 48 genes, we used the Tajima relative rate test (Tajima 1993). In nearly all cases, the resulting values were not significant, indicating approximate rate equivalence among lineages for each gene. Only four instances of significant (0.01 < P < 0.05) rate heterogeneity were indicated among the nearly 200 tests, as expected by chance alone. These were accelerated rates for the At homoeolog of C7, the D ortholog of G3, the A ortholog of CLK1, and the D ortholog of GhMYB4. Of these exceptions, C7 is perhaps the most interesting due to the number of replacements (28 changes in 839 nt) and the relatively high Ka:Ksil ratio (table 1). We note that the two branches leading to A/At and D/Dt were long considering the aligned length, suggesting that either C7 is a relatively fast-evolving gene or that we inadvertently isolated paralogs from one of these two clades.
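The logic of the 1D relative rate test of Tajima (1993) can be sketched as follows. For two ingroup sequences and an outgroup, the test counts sites at which only the first ingroup sequence differs (m1) and sites at which only the second differs (m2). Under equal rates these counts are expected to be equal, and (m1 - m2)^2 / (m1 + m2) is compared with a chi-square distribution with one degree of freedom. The code below is a simplified illustration, not the software used in the study: the function name `tajima_1d_test` is hypothetical, and columns with gaps or ambiguous bases are simply skipped.

```python
import math

BASES = {"A", "C", "G", "T"}


def tajima_1d_test(seq1, seq2, outgroup):
    """Tajima (1993) 1D relative rate test on three aligned sequences.

    Returns (m1, m2, chi2, p_value). Hypothetical helper for
    illustration only.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if a not in BASES or b not in BASES or o not in BASES:
            continue  # skip gaps and ambiguous bases
        if b == o and a != o:
            m1 += 1   # change unique to lineage 1
        elif a == o and b != o:
            m2 += 1   # change unique to lineage 2

    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0  # no informative sites

    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper-tail probability of a chi-square variate with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return m1, m2, chi2, p_value
```

A chi-square value above 3.84 corresponds to P < 0.05 with one degree of freedom, which is the threshold at which a gene would be flagged as showing lineage-specific rate heterogeneity.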
One implication of rate homogeneity is that the evolutionary rate for each gene in table 1 is a property of that gene rather than the particular Gossypium lineage to which it belongs. To examine this suggestion further, we calculated the correlation between synonymous substitution rates in A-At and D-Dt comparisons. Since these are independent divergences, a high correlation is expected only if molecular evolutionary rate reflects inherent properties of the gene and/or its genomic context. Pearson's correlation coefficient for these two sets of divergences, using Ks for the entire gene sequence (or just the coding region when only exons were available) was calculated to be 0.98, indicating a strong correlation.
Although relative rate tests and mean K2P divergence levels indicated that orthologous genes in diploid Gossypium accumulate nucleotide substitutions at approximately the same rate as their counterparts in the allopolyploids, the branch lengths of figure 2 suggest that there may be a slight elevation in molecular evolutionary rate in the allopolyploid. As noted above, in both the A-genome and D-genome branches, more substitutions were observed in the polyploid genome (At and Dt) than in orthologs from the corresponding diploid (123 versus 99 for the A-genome; 181 versus 154 for the D-genome). An analysis of variance showed both a locus (P = 0.003) and a genome (D faster than A; P = 0.001) effect but failed to detect a significant effect due to ploidy level (P = 0.16). However, when the data were tabulated on a gene-by-gene basis, there were 41 instances in which the number of substitutions was higher in the branch leading to the polyploid than to the diploid genome, whereas the reverse was true in only 26 cases (in 31 cases they were equal). Under the hypothesis of equal rates in polyploids and diploids, this difference may be interpreted as marginally significant (41 versus 26 with the expectation that these numbers would be equal [χ² = 3.36; 0.10 > P > 0.05]).
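The gene-by-gene comparison reported here can be checked with a short calculation. The sketch below assumes, as the text implies, that tied cases are excluded and that the two remaining counts are tested against an equal split using a chi-square goodness-of-fit statistic with one degree of freedom. The function name `chi_square_equal_split` is hypothetical.

```python
import math


def chi_square_equal_split(count_a, count_b):
    """Chi-square test of two counts against a 1:1 expectation (1 df).

    Returns (chi2, p_value). Hypothetical helper for illustration.
    """
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # Upper-tail probability of a chi-square variate with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# 41 cases with more substitutions on the polyploid branch,
# 26 cases with more on the diploid branch (ties omitted).
chi2, p_value = chi_square_equal_split(41, 26)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

With an expectation of 33.5 for each count, the statistic is 3.36 and the probability falls between 0.05 and 0.10, in agreement with the values given in the text.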
Discussion
A second noteworthy aspect of the Gossypium data is that silent substitution rates vary widely among genes. The range of silent substitution amounts among the 48 genes varies ninefold, from 0.018 to 0.162, although if the highest two and lowest two values are excluded, the range narrows considerably, from 0.035 to 0.101 (table 1). Thus, 90% of the values fall within a threefold range, which is remarkably similar to the 2.6-fold range recently reported for 242 gene pairs in Arabidopsis (Zhang, Vision, and Gaut 2002). Our data contribute to and exemplify the increasingly common reports of rate variation in a diverse assemblage of organisms (Wolfe, Sharp, and Li 1989; Moriyama and Gojobori 1992; Wolfe and Sharp 1993; Collins and Jukes 1994; Moriyama and Powell 1996; Zeng et al. 1998; Kusumi et al. 2002; Tiffin and Hahn 2002; Zhang, Vision, and Gaut 2002). These examples, drawn from across the phylogenetic spectrum, illustrate the generality that intergenic rate variation is a fundamental feature of complex eukaryotic genomes.
A third notable aspect of the Gossypium data concerns the ratio of replacement to silent substitutions. Mean Ka:Ksil ratios range from 0.271 to 0.455 for various intergenomic comparisons (table 3), with the lowest ratio corresponding to the greatest evolutionary distance (ingroup-outgroup) and the highest ratios corresponding to the smallest evolutionary distance (A-At and D-Dt). This suggestion of a relationship between degree of divergence and Ka:Ksil ratios is consistent with the notion that some proportion of amino acid substitutions are relatively neutral, vis-à-vis protein function, and hence nucleotide substitutions that cause these changes are neutral or near-neutral. These sites might be expected to behave more like silent sites in terms of evolutionary rates, but they also would become saturated more rapidly than other replacement sites. Hence, as evolutionary distance increases, the near-neutral replacement sites approach saturation, but purifying selection on nonneutral sites continues to retard accumulation of replacements. The net effect is that Ka:Ksil ratios become smaller as evolutionary distance increases (or, equivalently, that Ksil:Ka ratios become larger). This proposal is consistent with other studies involving more divergent taxa than those studied here, although the correlation with time may not be particularly tight due to the numerous other features that potentially influence molecular evolutionary rate. For anciently duplicated Arabidopsis genes, for example, the Ks:Ka ratio is 5 (Zhang, Vision, and Gaut 2002), whereas for rat and mouse the ratio is approximately 7 (Wolfe and Sharp 1993). Among genera of Cupressaceae, which may have diverged approximately 100 MYA, silent substitutions outnumber replacements by sevenfold to eightfold (Kusumi et al. 2002), with a similar ratio for 218 orthologs from Arabidopsis-Brassica, representing perhaps 35 Myr of divergence (Tiffin and Hahn 2002).
Phylogenetic Analysis and Independent Evolution of Homoeologs
Parsimony analysis for each of the 48 genes led to the recovery of the topology expected from our phylogenetic understanding of Gossypium (fig. 2). These results confirm and extend those of an earlier analysis of duplicate gene evolution in allopolyploid Gossypium (Cronn, Small, and Wendel 1999), which demonstrated that homoeologs in polyploid cotton evolve independently of one another in the allopolyploid nucleus. This stands in contrast to at least some repetitive DNAs, which experience postpolyploidization interlocus homogenization (Wendel et al. 1995) as a consequence of one or more processes of concerted evolution. Our results lend additional weight to the suggestion that intergenomic interactions between duplicated single-copy genes are uncommon in allopolyploids (Cronn, Small, and Wendel 1999; Wendel 2000). A similar conclusion was recently reached for 242 gene pairs duplicated by ancient polyploidy in Arabidopsis, where tests failed to provide evidence of gene conversion for any of the duplicates (Zhang, Vision, and Gaut 2002).
Rates of Gene Evolution in Diploid and Allopolyploid Gossypium
As shown in table 2, the mean K2P divergence of orthologs between A-genome and D-genome diploid cottons for 48 single-copy nuclear genes is 0.022, a value nearly identical to that obtained for the same genes isolated from the two descendent genomes (At and Dt) in allopolyploid cotton (0.024). Also, analysis of variance failed to detect an effect of ploidy on rates of sequence evolution. Thus, based on 40,000 nucleotides per taxon, these analyses suggest equivalent rates of genic evolution in diploid and allopolyploid cotton. We note, however, that the total number of substitutions in branches leading to the polyploid genomes (At and Dt) is approximately 20% higher than that of their diploid counterparts (fig. 2) and that a marginally significant (0.10 > P > 0.05) effect of polyploidy on nucleotide substitutions was revealed by analysis of branch lengths on a gene-by-gene basis. These data lead to the suggestion that polyploidy in Gossypium has been accompanied by a modest rate enhancement, as also suggested in an earlier study (Cronn, Small, and Wendel 1999). Perhaps it is not surprising that the rate acceleration is difficult to detect statistically, as polyploid Gossypium formed relatively recently, and so branch lengths are small and subject to proportionately high stochastic variation.
Although there exist no comparable surveys of homoeologous gene evolution in other plant polyploids, a number of studies have demonstrated dramatic genetic and epigenetic changes immediately after polyploidy in some plant groups (Song et al. 1995; Feldman et al. 1997; Liu, Vega, and Feldman 1998; Liu et al. 1998; Comai et al. 2000; Ozkan, Levy, and Feldman 2001; Shaked et al. 2001; Kashkush, Feldman, and Levy 2002). The present study complements other recent analyses of polyploid genome evolution in Gossypium (Brubaker, Paterson, and Wendel 1999; Cronn, Small, and Wendel 1999; Liu et al. 2001) in showing that polyploidy is not accompanied by rapid genome change. In this context we detected no cases of gene loss or gene conversion, and evolutionary rate enhancements, if real, are only modest. Similar relative genomic stasis has been reported for the young allopolyploid grass, Spartina anglica (Baumel, Ainouche, and Levasseur 2001), and Brassica juncea (contra Song et al. 1995; Axelsson et al. 2000). As suggested here for Gossypium, it will be of interest to explore whether silent substitution rates for single-copy nuclear genes are elevated relative to their diploid progenitors in allopolyploid systems that are subject to genomic instability, such as in Aegilops/Triticum and Brassica.
Modern Diploids and the Ancestors of Polyploid Cotton
Ever since the discovery that tetraploid Gossypium species contain two different genomes, investigators have attempted to address the question of parentage; that is, which of the modern species of A-genome and D-genome diploids best serve as models of the progenitor genome donors? Over the decades, a diverse array of tools has been used to address this question (reviewed in Wendel and Cronn 2003), collectively demonstrating that the best extant models of the ancestral genome donors are G. arboreum and G. herbaceum (A-genome) and G. raimondii (D-genome). Cytogenetic and segregation data suggested that the A-genome of allopolyploid cotton is more similar to that of the A-genome diploids than the D-genome of the allopolyploid is to that of the D-genome diploids. For example, in synthetic allohexaploids formed between diploid and allopolyploid cotton, multivalent frequencies are higher and genetic segregation more closely approximates autotetraploid ratios for A-genome chromosomes than for D-genome chromosomes (Gerstel and Phillips 1958; Phillips 1964). Subsequent data from many sources have confirmed this observation (Wendel and Cronn 2003). Cronn, Small, and Wendel (1999) quantified these relationships using 14,705 nt of sequence information for 16 nuclear loci isolated from the D-genome diploid G. raimondii, the A-genome diploid G. arboreum (or G. herbaceum), and the AD-genome tetraploid G. hirsutum, much as in the present study but with a smaller sampling of genes. Sequence divergences between the diploids and their corresponding genomes in the allopolyploid were 0.68% and 1.05% for the A-genomes and D-genomes, respectively.
The present study confirms and extends this understanding: Kimura two-parameter genetic distances between A and At and between D and Dt are 0.007 and 0.010, respectively (table 2), and this same quantitative relationship is captured in the branch lengths of figure 2 (222 versus 335 total substitutions distinguishing the two genomes in the A and D clades, respectively). Thus, G. arboreum and G. herbaceum may be thought of as an approximately 50% better model of the progenitor A-genome diploid than G. raimondii is of the D-genome diploid.
Gene Evolution, the Molecular Clock, and the Age of Polyploidy in Gossypium
Abundant evidence establishes that the five species of tetraploid cottons are allopolyploids containing one genome similar to those found in the Old World, A-genome diploids, and a second genome like those of the New World, D-genome diploids (reviewed in Wendel and Cronn 2003). Because the two parental genome groups exist in diploid species that presently occupy different hemispheres, the question of how and when allopolyploid cotton formed has stimulated discussion for more than 50 years. Some authors have suggested that Gossypium had an ancient, perhaps Cretaceous origin, due to its global distribution and high level of cytogenetic and morphological diversity, whereas others have invoked an origin of allopolyploids in agricultural times, forwarding a scenario that involved human transfer of an African or Asiatic A-genome cultigen to the New World, followed by deliberate or accidental hybridization with a wild D-genome species. These speculations and others, which encompass proposals ranging from a Cretaceous (60 to 100 MYA) to a recent (6,000 years ago) origin, are discussed at length in Wendel and Cronn (2003).
DNA sequence data have uniformly supported the view that allopolyploid Gossypium originated prior to the evolution of modern humans but relatively recently in geological terms, perhaps during the Pleistocene 1 to 2 MYA (Wendel 1989; Seelanan, Schnabel, and Wendel 1997; Small et al. 1998; Cronn, Small, and Wendel 1999). Cronn, Small, and Wendel (1999), in a study of 16 low-copy nuclear sequences, reported that mean sequence divergence between the diploids and their counterparts in the allopolyploid averaged 0.68% and 1.05%, respectively, for the A-genome and D-genome comparisons. Similar values (Ksil = 0.7% and 1.1%, respectively; Ks = 0.9% and 1.1%, respectively [table 3]) were obtained in the present study using a mean rate based on three times as many genes. Relative rate tests revealed no evidence of lineage-specific effects, so this source of potential error in molecular clock applications is minimized.
This leaves clock calibration as the most troublesome source of error in estimating divergence dates. Although rates of synonymous site evolution have been estimated for several plants using a small sample of genes (2.6 × 10⁻⁹ to 1.5 × 10⁻⁸ substitutions/synonymous site/year [Morton, Gaut, and Clegg 1996; Gaut 1998; Koch, Haubold, and Mitchell-Olds 2000]), little is known about the general utility of these estimates. To the extent that they are applicable, and given that generation time is negatively correlated with molecular evolutionary rates (Gaut 1998) and that wild Gossypium species are long-lived perennials, it is likely that the more appropriate end of the spectrum to use is the slower rates. To estimate the age of allopolyploid formation, the values for the A-genome listed above are the most relevant, noting that these data will provide the maximum age of Gossypium allopolyploids. This is because modern A-genome diploid cottons may not be the direct descendants of the actual genome donors. Instead, we know only that they are the closest living models of the ancestral diploid implicated in allopolyploid formation.
Using the formula T = K/2r, where K equals divergence amount (Ksil, Ks from table 3) and r corresponds to the rate of divergence for nuclear genes from plants (2.6 × 10⁻⁹ substitutions/site/year), we estimate that allopolyploids formed 1.3 to 1.7 MYA, depending on whether silent (synonymous plus noncoding) or just synonymous sites are used in the calculation. Hence, it seems probable that Gossypium allopolyploids formed in the Mid-Pleistocene, circa 1 to 2 MYA, as suggested by other authors using different criteria (Wendel and Cronn 2003). Extending the analysis to the diploids, we estimate that the A-genome and D-genome lineages diverged from one another 6.0 to 7.3 MYA and that Gossypium last shared a common ancestor with its closest relatives (Gossypioides and Kokia) 11.3 to 14.2 MYA. Thus, the two diploid genomes, A and D, experienced approximately 5 Myr of evolution in isolation from one another prior to their reunion at the time of polyploid formation during the Pleistocene.
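The calculation T = K/2r can be reproduced in a few lines. The sketch below uses the rounded A-At divergence values quoted in the text (Ksil = 0.7% and Ks = 0.9%) together with the slow and fast ends of the published rate range, so its results match the reported estimates only to the precision of that rounding. The function and constant names are hypothetical.

```python
def divergence_time_years(k, r):
    """Time since divergence, T = K / (2r).

    k: substitutions per site separating two sequences
    r: substitutions per site per year along a single lineage
    """
    return k / (2 * r)


# Rates quoted in the text (substitutions/synonymous site/year).
SLOW_RATE = 2.6e-9
FAST_RATE = 1.5e-8

# Rounded A-At divergences quoted in the text.
A_AT_DIVERGENCE = {"Ksil": 0.007, "Ks": 0.009}

for label, k in A_AT_DIVERGENCE.items():
    slow = divergence_time_years(k, SLOW_RATE) / 1e6
    fast = divergence_time_years(k, FAST_RATE) / 1e6
    print(f"{label}: {slow:.2f} MYA (slow rate), {fast:.2f} MYA (fast rate)")
```

With the slower rate the two divergence values give roughly 1.3 and 1.7 million years. The factor of 2 in the denominator reflects that substitutions accumulate along both lineages after they separate.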
Several sources of error remain unaccounted for in the foregoing calculations, and clearly clock calibration remains an important consideration. For example, if the less likely (in our opinion) faster rate estimates reported above (1.5 × 10⁻⁸ substitutions/synonymous site/year) are used, polyploid formation may be estimated to have occurred as recently as 230,000 to 300,000 years ago. Using 48 genes, however, and establishing rate homogeneity in the taxa under study, lends a degree of confidence to the interpretation offered that polyploid formation is of mid-Pleistocene age. Additional insight into the accuracy of this inference will require more data on absolute rates of synonymous site divergence, which remain the largest single source of possible error.
Acknowledgements
Footnotes
Brandon Gaut, Associate Editor
Literature Cited
Axelsson, T., C. M. Bowman, A. G. Sharpe, D. J. Lydiate, and U. Lagercrantz. 2000. Amphidiploid Brassica juncea contains conserved progenitor genomes. Genome 43:679-688.
Baumel, A., M. L. Ainouche, and J. E. Levasseur. 2001. Molecular investigations in populations of Spartina anglica C. E. Hubbard (Poaceae) invading coastal Brittany (France). Mol. Ecol. 10:1689-1701.
Begun, D. J., and C. F. Aquadro. 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356:519-520.
Brubaker, C. L., A. H. Paterson, and J. F. Wendel. 1999. Comparative genetic mapping of allotetraploid cotton and its diploid progenitors. Genome 42:184-203.
Brubaker, C. L., and J. F. Wendel. 1994. Reevaluating the origin of domesticated cotton (Gossypium hirsutum; Malvaceae) using nuclear restriction fragment length polymorphisms (RFLPs). Am. J. Bot. 81:1309-1326.
Cedroni, M. L., R. C. Cronn, K. L. Adams, T. A. Wilkins, and J. F. Wendel. 2002. Evolution and expression of MYB genes in diploid and polyploid cotton. Plant Mol. Biol. 51:313-325.
Collins, D. W., and T. H. Jukes. 1994. Rates of transition and transversion in coding sequences since the human-rodent divergence. Genomics 20:386-396.
Comai, L., A. P. Tyagi, K. Winter, R. Holmes-Davis, S. H. Reynolds, Y. Stevens, and B. Byers. 2000. Phenotypic instability and rapid gene silencing in newly formed Arabidopsis allotetraploids. Plant Cell 12:1551-1567.
Cronn, R. C., M. Cedroni, T. Haselkorn, C. Osborne, and J. F. Wendel. 2002a. PCR-mediated recombination in amplification products derived from polyploid cotton. Theor. Appl. Genet. 104:482-489.
Cronn, R. C., R. L. Small, T. Haselkorn, and J. F. Wendel. 2002b. Rapid diversification of the cotton genus (Gossypium: Malvaceae) revealed by analysis of sixteen nuclear and chloroplast genes. Am. J. Bot. 89:707-725.
Cronn, R. C., R. L. Small, and J. F. Wendel. 1999. Duplicated genes evolve independently after polyploid formation in cotton. Proc. Natl. Acad. Sci. USA 96:14406-14411.
Cronn, R. C., X. Zhao, A. H. Paterson, and J. F. Wendel. 1996. Polymorphism and concerted evolution in a tandemly repeated gene family: 5S ribosomal DNA in diploid and allopolyploid cottons. J. Mol. Evol. 42:685-705.
Endrizzi, J. E., F. R. H. Katterman, and R. F. Geever. 1989. DNA hybridization and time of origin of three species of Gossypium. Evol. Trends Plants 3:553-559.
Feldman, M., B. Liu, G. Segal, S. Abbo, A. A. Levy, and J. M. Vega. 1997. Rapid elimination of low-copy DNA sequences in polyploid wheat: a possible mechanism for differentiation of homoeologous chromosomes. Genetics 147:1381-1387.
Galau, G. A., and T. A. Wilkins. 1989. Alloplasmic male sterility in AD allotetraploid Gossypium hirsutum upon replacement of its resident A cytoplasm with that of the D species G. harknessii. Theor. Appl. Genet. 78:23-30.
Gaut, B. S. 1998. Molecular clocks and nucleotide substitution rates in higher plants. Pp. 93-120 in M. K. Hecht, ed. Evolutionary biology. Plenum Press, New York.
Gerstel, D. U., and L. L. Phillips. 1958. Segregation of synthetic amphiploids in Gossypium and Nicotiana. Cold Spring Harbor Symp. Quant. Biol. 23:225-237.
Hall, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41:95-98.
Hillis, D. M., C. Moritz, and B. K. Mable. 1996. Molecular systematics. 2nd edition. Sinauer Associates, Sunderland, Mass.
Kashkush, K., M. Feldman, and A. A. Levy. 2002. Gene loss, silencing, and activation in a newly synthesized wheat allotetraploid. Genetics 160:1651-1659.
Koch, M. A., B. Haubold, and T. Mitchell-Olds. 2000. Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol. Biol. Evol. 17:1483-1498.
Kusumi, J., Y. Tsumura, H. Yoshimaru, and H. Tachida. 2002. Molecular evolution of nuclear genes in Cupressaceae, a group of conifer trees. Mol. Biol. Evol. 19:736-747.
Lercher, M. J., and L. D. Hurst. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18:337-340.
Liu, B., C. L. Brubaker, G. Mergeai, R. C. Cronn, and J. F. Wendel. 2001. Polyploid formation in cotton is not accompanied by rapid genomic changes. Genome 44:321-330.
Liu, B., J. M. Vega, and M. Feldman. 1998. Rapid genomic changes in newly synthesized amphiploids of Triticum and Aegilops. II. Changes in low-copy coding DNA sequences. Genome 41:535-542.
Liu, B., J. M. Vega, G. Segal, S. Abbo, M. Rodova, and M. Feldman. 1998. Rapid genomic changes in newly synthesized amphiploids of Triticum and Aegilops. I. Changes in low-copy noncoding DNA sequences. Genome 41:272-277.
Loguercio, L. L., J.-Q. Zhang, and T. A. Wilkins. 1999. Differential regulation of six novel MYB-domain genes defines two distinct expression patterns in allotetraploid cotton (Gossypium hirsutum L.). Mol. Gen. Genet. 261:660-671.
Moriyama, E. N., and J. R. Powell. 1996. Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.
Morton, B. R., B. S. Gaut, and M. T. Clegg. 1996. Evolution of alcohol dehydrogenase genes in the palm and grass families. Proc. Natl. Acad. Sci. USA 93:11735-11739.
Ozkan, H., A. A. Levy, and M. Feldman. 2001. Allopolyploidy-induced rapid genome evolution in the wheat (Aegilops-Triticum) group. Plant Cell 13:1735-1747.
Phillips, L. L. 1964. Segregation in new allopolyploids of Gossypium. V. Multivalent formation in New World x Asiatic and New World x wild American hexaploids. Am. J. Bot. 51:324-329.
Rozas, J., and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular genetics and molecular evolution analysis. Bioinformatics 15:174-175.
Sanderson, M. 1998. Estimating rate and time in molecular phylogenies: beyond the molecular clock? Pp. 242-264 in P. S. Soltis, D. E. Soltis, and J. J. Doyle, eds. Molecular systematics of plants. Kluwer, Boston.
Seelanan, T., A. Schnabel, and J. F. Wendel. 1997. Congruence and consensus in the cotton tribe. Syst. Bot. 22:259-290.
Shaked, H., K. Kashkush, H. Ozkan, M. Feldman, and A. A. Levy. 2001. Sequence elimination and cytosine methylation are rapid and reproducible responses of the genome to wide hybridization and allopolyploidy in wheat. Plant Cell 13:1749-1759.
Small, R. L., J. A. Ryburn, R. C. Cronn, T. Seelanan, and J. F. Wendel. 1998. The tortoise and the hare: choosing between noncoding plastome and nuclear Adh sequences for phylogeny reconstruction in a recently diverged group. Am. J. Bot. 85:1301-1315.
Small, R. L., and J. F. Wendel. 2000. Copy number lability and evolutionary dynamics of the Adh gene family in diploid and tetraploid cotton (Gossypium). Genetics 155:1913-1926.
Soltis, P. S., D. E. Soltis, V. Savolainen, P. R. Crane, and T. G. Barraclough. 2002. Rate heterogeneity among lineages of tracheophytes: integration of molecular and fossil data and evidence for molecular living fossils. Proc. Natl. Acad. Sci. USA 99:4430-4435.
Song, K., P. Lu, K. Tang, and T. C. Osborn. 1995. Rapid genome change in synthetic polyploids of Brassica and its implications for polyploid evolution. Proc. Natl. Acad. Sci. USA 92:7719-7723.
Sorhannus, U., and M. Fox. 1999. Synonymous and nonsynonymous substitution rates in diatoms: a comparison between chloroplast and nuclear genes. J. Mol. Evol. 48:209-212.
Stephan, W., and C. H. Langley. 1998. DNA polymorphism in Lycopersicon and crossing-over per physical length. Genetics 150:1585-1593.
Swofford, D. L. 1998. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0. Sinauer Associates, Sunderland, Mass.
Tajima, F. 1993. Simple methods for testing the molecular evolutionary clock hypothesis. Genetics 135:599-607.
Ticher, A., and D. Graur. 1989. Nucleic acid composition, codon usage, and the rate of synonymous substitution in protein-coding genes. J. Mol. Evol. 28:286-298.
Tiffin, P., and M. W. Hahn. 2002. Coding sequence divergence between two closely related plant species: Arabidopsis thaliana and Brassica rapa ssp. pekinensis. J. Mol. Evol. 54:746-753.
Tomkins, J. P., D. G. Peterson, T. J. Yang, D. Main, T. A. Wilkins, A. H. Paterson, and R. A. Wing. 2001. Development of genomic resources for cotton (Gossypium hirsutum L.): BAC library construction, preliminary STC analysis, and identification of clones associated with fiber development. Mol. Breed. 8:255-261.
Wendel, J. F. 1989. New World tetraploid cottons contain Old World cytoplasm. Proc. Natl. Acad. Sci. USA 86:4132-4136.
Wendel, J. F. 2000. Genome evolution in polyploids. Plant Mol. Biol. 42:225-249.
Wendel, J. F., and R. C. Cronn. 2003. Polyploidy and the evolutionary history of cotton. Adv. Agron. 78:139-186.
Wendel, J. F., R. C. Cronn, J. S. Johnston, and H. J. Price. 2002. Feast and famine in plant genomes. Genetica 115:36-47.
Wendel, J. F., A. Schnabel, and T. Seelanan. 1995. Bidirectional interlocus concerted evolution following allopolyploid speciation in cotton (Gossypium). Proc. Natl. Acad. Sci. USA 92:280-284.
Wilkins, T. A., and J. A. Jernstedt. 1999. Molecular genetics of developing cotton fibers. Pp. 231-267 in A. S. Basra, ed. Cotton fibers. Haworth Press, New York.
Williams, E. J., and L. D. Hurst. 2000. The proteins of linked genes evolve at similar rates. Nature 407:900-903.
Wolfe, K. H., and P. M. Sharp. 1993. Mammalian gene evolution: nucleotide sequence divergence between mouse and rat. J. Mol. Evol. 37:441-456.
Wolfe, K. H., P. M. Sharp, and W.-H. Li. 1989. Rates of synonymous substitution in plant nuclear genes. J. Mol. Evol. 29:208-211.
Yang, Z., and J. P. Bielawski. 2000. Statistical methods for detecting molecular adaptation. Trends Ecol. Evol. 15:496-503.
Zeng, L.-W., J. M. Comeron, B. Chen, and M. Kreitman. 1998. The molecular clock revisited: the rate of synonymous versus replacement change in Drosophila. Genetica 102/103:369-382.
Zhang, L., T. J. Vision, and B. S. Gaut. 2002. Patterns of nucleotide substitution among simultaneously duplicated gene pairs in Arabidopsis thaliana. Mol. Biol. Evol. 19:1464-1473.