Hopkins Marine Station of Stanford University, Pacific Grove, California
Abstract |
---|
Key Words: vertebrate genome evolution, GC content, codon bias, temperature, alpha-actin, lactate dehydrogenase-A (ldh-a)
Introduction |
---|
To account for the differences in GC content of the isochores found in genomes of endothermic and ectothermic species, it has been hypothesized that the higher GC contents in avian and mammalian genomes reflect selection for higher thermal stability of DNA and RNA (Bernardi 2000a, 2000b). The additional hydrogen bond found in GC pairs relative to AT pairs would give the DNA and RNA of birds and mammals a higher thermodynamic stability, thus reducing chances for disruption of structure at high body temperatures. Conversely, lower GC levels in cold-adapted ectotherms may reduce the energy required to "melt" double-stranded DNA for transcription and replication. After Eyre-Walker and Hurst (2001), we refer to this hypothesis as the thermostability hypothesis. Tests of this hypothesis have involved two different types of strategy (for review, see Bernardi 2000a, 2000b). First, comparisons of whole genomes using cesium chloride gradients have generally supported the view that genomes of ectothermic vertebrates lack the asymmetry of modal buoyant densities that is indicative of the small, GC-rich isochores found in mammals and birds. Second, the GC contents of homologous genes (total %GC of the gene, third position GC content [GC3], and GC content of fourfold degenerate sites [GC4]) have been compared. These comparisons have typically focused on only a few species for which large numbers of genes have been sequenced. Bernardi and Bernardi (1991) took a broader comparative approach by examining genes of 19 species of fish, five species of amphibians, five species of reptiles, and humans. The human genes more frequently had a higher GC3 content. However, only for one gene was there more than one ectothermic sequence to compare against the human sequence, and no explicit comparisons were made among the ectotherms with respect to their different adaptation temperatures.
Perrin and Bernardi (1987) examined 21 genes in pairwise comparisons between ectotherms and endotherms, but only a single gene in their analysis had sequences available for more than one ectothermic species. Nonetheless, their results showed that 12 of the 21 genes showed an increase in GC3 content in endotherms, whereas six genes showed no difference and three showed higher GC3 in ectotherms. As was noted in a recent review by Eyre-Walker and Hurst (2001), these previous studies have failed to examine the precise relationship between body temperature and genomic base composition, a central tenet of the thermostability hypothesis.
The present study was designed to redress certain limitations of these earlier studies by examining the sequences of the genes encoding two widely occurring housekeeping proteins, lactate dehydrogenase-A (ldh-a) and alpha-actin (α-actin), in a large number of vertebrates with widely different adaptation temperatures. These two genes in particular were chosen for the availability of sequence data from a diverse array of vertebrate species with a broad range of physiological temperatures. We reasoned that if the thermostability hypothesis is generally correct, then the GC contents of homologous genes of vertebrates should vary in a regular manner with adaptation temperature, independently of the taxonomic grouping (endotherm vs. ectotherm) of the species. For example, ectothermic vertebrates with body temperatures equal to or higher than those of mammals and birds would be expected to have GC contents similar to, if not higher than, those of endotherms. To maximize the power of our comparisons, we selected 51 species that represent the full range of vertebrate body temperatures, from -1.86°C in cold-adapted Antarctic notothenioid fishes to approximately 45°C in a thermophilic desert reptile. Our results fail to support the thermostability hypothesis, and they call into question certain interpretations given in previous analyses of interspecific variation in GC content.
Materials and Methods |
---|
Nucleotide Composition
All vertebrate sequences were aligned and then base composition was determined for each sequence using the DNAStar suite of programs (DNAStar Inc.). From this analysis the overall GC content and the GC content at each of the three codon positions (GC1, GC2, and GC3) and the fourfold degenerate sites (GC4) were calculated. For statistical correlations, mean values of GC content were taken from: (1) the four sub-Antarctic species, (2) the 13 Antarctic species, and (3) species for which there were multiple full-length coding sequences available for the same species. This was done to minimize skewing the data from extreme cases of nonindependence due to phylogenetic relatedness that, we suggest, likely represent a single adaptation event to a common environment. Additionally, independent contrasts controlling for phylogenetic nonindependence (Felsenstein 1985) were calculated for all species for body temperature versus GC3 and GC4 for both ldh-a and α-actin using the CAIC freeware package (Purvis and Rambaut 1995). Phylogenies were generated from DNA sequence data in the first and second codon positions using the PAUP* 4.0 software program (Swofford 1998). Phylogenies were reconstructed using a neighbor-joining method (Saitou and Nei 1987) under a minimum evolution (distance) criterion using Tamura-Nei corrected divergence estimates (Tamura and Nei 1993). In general, the phylogeny based on the first two positions of the α-actin gene was poorly supported for the 13 different vertebrate orders represented, whereas the ldh-a phylogeny had better bootstrap support that was in agreement with overall patterns of vertebrate evolution (with the exception of Xenopus). These phylogenies, along with bootstrap support from 500 replicates, were used in the independent contrast analyses and are available in the online supplementary materials.
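For readers who wish to reproduce the composition measures, the following is a minimal Python sketch of how overall GC, GC1, GC2, GC3, and GC4 might be computed from an in-frame coding sequence. It is not the DNAStar procedure used in this study. The function names, the handling of gaps and ambiguous bases (skipped), the inclusion of the stop codon, and the example sequence are all assumptions made for illustration. Fourfold degenerate sites are taken as the third positions of the eight codon families of the standard genetic code in which any third base encodes the same amino acid.

```python
# Two-base codon prefixes whose third position is fourfold degenerate in the
# standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly families).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc_fraction(bases):
    """Fraction of G or C among unambiguous bases; NaN if none are present."""
    called = [b for b in bases if b in "ACGT"]  # skips gaps and ambiguity codes
    if not called:
        return float("nan")
    return sum(b in "GC" for b in called) / len(called)


def gc_profile(cds):
    """Return overall GC, GC1, GC2, GC3, and GC4 for an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    profile = {"GC": gc_fraction("".join(codons))}
    for pos in range(3):
        profile[f"GC{pos + 1}"] = gc_fraction([c[pos] for c in codons])
    # GC4: third positions of codons belonging to fourfold degenerate families
    profile["GC4"] = gc_fraction(
        [c[2] for c in codons if c[:2] in FOURFOLD_PREFIXES]
    )
    return profile


# Hypothetical five-codon sequence (ATG GCC CTG AAA TAA), for illustration only.
example = gc_profile("ATGGCCCTGAAATAA")
```

In this toy sequence only GCC and CTG belong to fourfold degenerate families, so GC4 is based on two sites (both G or C, giving 1.0), whereas GC3 is based on all five third positions (0.6). The example illustrates why GC4 rests on a smaller fraction of the data than GC3.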
Body Temperature
Body temperatures for organisms were taken from the literature, if available, or were assigned on the basis of habitat range and behavioral adaptation in the case of ectotherms (table 1; see online supplementary materials for complete references). To correct for the large changes in body temperature that may be experienced by several of the ectotherms in this study, we took into account the upper, mean (or preferred), and lower thermal limits of each species. Lethally high and low temperatures were not used in compiling body temperature data; rather the low and high temperatures reflect the ambient temperature ranges experienced by these species on an annual basis.
Results |
---|
Figure 1 plots GC4 of the ldh-a (panel A) and α-actin (panel C) genes as functions of mean body temperature. Independent contrasts analyses (mean body temperature versus GC4) are also shown for each gene (panel B: ldh-a; panel D: α-actin). For α-actin, the slopes of the regression lines do not differ significantly from zero (r² = 0.111, P = 0.111; independent contrast: r² = 0.00153, P = 0.846). For ldh-a, the slope of the regression line describing body temperature and GC4 without correction for phylogenetic effects is significantly different from zero. In contradiction to the thermostability hypothesis, the slope is negative rather than positive (r² = 0.381, P = 0.0022). However, when these data are corrected for phylogenetic relatedness, no significant trend is found (r² = 0.0240, P = 0.352).
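The independent-contrast step can be sketched in a few lines of Python, following the general algorithm of Felsenstein (1985). This is an illustrative sketch and not the CAIC implementation. The tree encoding, the function names, the species labels (`sp_a` to `sp_d`), the branch lengths, and the trait values are all hypothetical and are not data from this study. Standardized contrasts are computed separately for each trait on the same rooted binary tree, and the contrasts in one trait are then regressed on those in the other with the line forced through the origin.

```python
import math


def contrasts(tree, values):
    """Standardized contrasts for one trait on a rooted binary tree.

    A leaf is a species name (str); an internal node is a tuple
    (left_subtree, left_branch_length, right_subtree, right_branch_length).
    Returns (node_value, extra_branch_length, list_of_contrasts).
    """
    if isinstance(tree, str):
        return float(values[tree]), 0.0, []
    left, len_left, right, len_right = tree
    x_l, extra_l, c_l = contrasts(left, values)
    x_r, extra_r, c_r = contrasts(right, values)
    # Lengthen each branch to reflect uncertainty in the estimated node values.
    v_l = len_left + extra_l
    v_r = len_right + extra_r
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)
    # Node value: average of the daughters weighted by inverse branch length.
    node_value = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)
    extra = v_l * v_r / (v_l + v_r)
    return node_value, extra, c_l + c_r + [contrast]


def slope_through_origin(xs, ys):
    """Least-squares slope of ys on xs for a line forced through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical four-species tree and invented trait values (illustration only).
tree = ((("sp_a", 1.0, "sp_b", 1.0), 1.0, "sp_c", 2.0), 1.0, "sp_d", 3.0)
body_temp = {"sp_a": 0.0, "sp_b": 5.0, "sp_c": 25.0, "sp_d": 37.0}
gc4 = {"sp_a": 60.0, "sp_b": 57.5, "sp_c": 47.5, "sp_d": 41.5}

temp_contrasts = contrasts(tree, body_temp)[2]
gc4_contrasts = contrasts(tree, gc4)[2]
slope = slope_through_origin(temp_contrasts, gc4_contrasts)
```

A tree of n species yields n - 1 contrasts, each intended to represent an independent evolutionary divergence. Closely related species with similar values therefore contribute little to the fitted slope, which is how a trend that appears significant in the uncorrected regression can disappear after correction.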
Note that in the case of α-actin, the values for Xenopus (filled triangles) either lie on or just outside the lower standard deviation contour around the regression line (fig. 1 and additional analyses [not shown]). The α-actin gene of Xenopus has a markedly lower total GC content compared with all of the other species we examined. The outlier quality of the α-actin data for Xenopus suggests caution when considering the suitability of Xenopus as a "representative" ectotherm (see Discussion).
Discussion |
---|
These findings represent a taxonomically diverse array of 51 vertebrate species from a wide range of physiological temperatures, including a number of ectotherms that have evolved under comparable or, in some cases, even greater physiological temperatures than those of mammals and birds (table 1). Because some ectotherms may experience a broad range of environmental temperatures, patterns associated with high, mean, and low temperatures were investigated independently, and all displayed essentially the same trends. It is reassuring to note that the conclusions are robust, regardless of whether the posited selection would be acting on maximal, mean, or minimal temperature experienced by an organism on an annual basis.
The observed patterns in our results are also consistent across total GC, GC3, and GC4. Each level not only represents a smaller fraction of the data set but also represents a step further away from confounding selective constraints based on limits to amino acid sequence as determined by constraints on the structure and function of the encoded proteins, alpha-actin and A4-LDH. In this regard, GC4 is the most instructive because it is free of all such selective constraints and yet does not demonstrate a significantly positive GC-temperature correlation.
The results reported for α-actin and ldh-a corroborate conclusions reached in analyses of prokaryotic genomes, where the GC content was found not to correlate with the optimal temperature of numerous prokaryote species (Galtier and Lobry 1997). More specifically, GC3 did not correlate with optimal temperature when phylogenetic relatedness was controlled for in the analysis (Hurst and Merchant 2001).
The two genes that were examined here are both major housekeeping genes, essential to all vertebrates, and prime examples of the sort of "important proteins" that were suggested to be encoded by GC-rich sequences to confer a thermostability selective advantage (Bernardi 2000a, 2000b). Thus, it is difficult to argue that these two genes are any less important or representative of a vertebrate genome than any other two such genes. Nevertheless, it is clear that two genes can no more represent vertebrate genome evolution than two species can.
While there are too few genes available here to refute the thermostability hypothesis or alternatives similarly based on physiological temperature, our results do suggest caution when generalizing about causative factors in vertebrate genome evolution. The lack of correlations between temperature and GC content when a more careful examination of ectothermic physiological temperatures is incorporated suggests that the dichotomy between warm-blooded and cold-blooded, while tempting in its simplicity, may not fully explain the evolution of vertebrate GC-rich isochores. The extent to which GC-rich isochores are characteristic only of birds and mammals is another issue that requires further clarification. It may be the case that GC-rich isochores evolved before the evolution of endothermy. Indeed, Hughes, Zelus, and Mouchiroud (1999) found evidence for GC-rich isochores in two distantly related reptiles, the Nile crocodile and the red-eared slider turtle, albeit for a limited number of sequences. A final caveat raised by our study concerns the use of Xenopus spp. as representative of cold-blooded vertebrates in general. As the analyses of α-actin consistently showed, Xenopus spp. tend to be outliers that differ from other ectothermic species.
A more complete picture of vertebrate genome evolution awaits the critical examination of GC content and codon bias in both dimensions: expanding both the numbers of genes and the diversity of species examined. Hopefully, these sorts of data will soon be tractable to collect and analyze as we emerge into a new era of comparative genomics, where complete genomes of many diverse species become available.
Acknowledgements |
---|
Footnotes |
---|
Literature Cited |
---|
Bernardi, G. 1995. The human genome: organization and evolutionary history. Annu. Rev. Genet. 29:445-476.
Bernardi, G. 2000a. The compositional evolution of vertebrate genomes. Gene 259:31-43.
Bernardi, G. 2000b. Isochores and the evolutionary genomics of vertebrates. Gene 241:3-17.
Bernardi, G., and G. Bernardi. 1990. Compositional patterns in the nuclear genome of cold-blooded vertebrates. J. Mol. Evol. 31:265-281.
Bernardi, G., and G. Bernardi. 1991. Compositional properties of nuclear genes from cold-blooded vertebrates. J. Mol. Evol. 33:57-67.
Bernardi, G., B. Olofsson, J. Filipski, M. Zerial, J. Salinas, G. Cuny, M. Meunier-Rotival, and F. Rodier. 1985. The mosaic genome of warm-blooded vertebrates. Science 228:953-958.
Cuny, G., P. Soriano, G. Macaya, and G. Bernardi. 1981. The major components of the mouse and human genomes. 1. Preparation, basic properties and compositional heterogeneity. Eur. J. Biochem. 115:227-233.
Eyre-Walker, A., and L. D. Hurst. 2001. The evolution of isochores. Nature Rev. Genet. 2:549-555.
Felsenstein, J. 1985. Phylogenies and the comparative method. Am. Nat. 125:1-15.
Galtier, N., and J. R. Lobry. 1997. Relationships between genomic G+C content, RNA secondary structures, and optimal growth temperature in prokaryotes. J. Mol. Evol. 44:632-636.
Hughes, S., D. Zelus, and D. Mouchiroud. 1999. Warm-blooded isochore structure in Nile crocodile and turtle. Mol. Biol. Evol. 16:1521-1527.
Hurst, L. D., and A. R. Merchant. 2001. High guanine-cytosine content is not an adaptation to high temperature: a comparative analysis amongst prokaryotes. Proc. R. Soc. Lond. B Biol. Sci. 268:493-497.
Macaya, G., J. P. Thiery, and G. Bernardi. 1976. An approach to the organization of eukaryotic genomes at a macromolecular level. J. Mol. Biol. 108:237-254.
Perrin, P., and G. Bernardi. 1987. Directional fixation of mutations in vertebrate evolution. J. Mol. Evol. 26:301-310.
Purvis, A., and A. Rambaut. 1995. Comparative analysis by independent contrasts (CAIC): an Apple Macintosh application for analysing comparative data. Comput. Appl. Biosci. 11:247-251.
Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Swofford, D. L. 1998. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.
Tamura, K., and M. Nei. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thiery, J. P., G. Macaya, and G. Bernardi. 1976. An analysis of eukaryotic genomes by density gradient centrifugation. J. Mol. Biol. 108:219-235.
Zoubak, S., O. Clay, and G. Bernardi. 1996. The gene distribution of the human genome. Gene 174:95-102.