Evolutionary Rates of Duplicate Genes in Fish and Mammals

Marc Robinson-Rechavi² and Vincent Laudet

Centre National de la Recherche Scientifique UMR 5665, Laboratoire de Biologie Moléculaire et Cellulaire, Ecole Normale Supérieure de Lyon, Lyon, France

Recently, much attention has been drawn to the abundance of duplicate genes in teleost fish (Amores et al. 1998; Wittbrodt, Meyer, and Schartl 1998). It has been suggested that this abundance reflects an ancestral genome duplication and that it may be related to the great diversity of this group (Vogel 1998). Emphasizing the importance of gene duplication in evolution, Ohno (1970) pointed out that at least one of the two copies may become less constrained by selection and thus be able to evolve toward a new function. Hughes and Hughes (1993) tested this hypothesis in the recent tetraploid Xenopus laevis and showed that both duplicate copies evolve at the same rate, with evidence of negative selection on both.

Our aim was to determine whether duplicated genes have allowed a further exploration of the adaptive landscape in teleost fish and mammalian genomes. For this, we compared all genes for which at least one duplication was characterized either in a teleost fish or in a eutherian mammal. Although pseudogenes are relevant to the overall duplication rate, they do not contribute to any adaptive role of gene or genome duplication. Since we were interested here in functional evolution, we did not take into account duplications which clearly led to the formation of pseudogenes. This was done by excluding all sequences explicitly declared to be pseudogenes in the HOVERGEN database (Duret, Mouchiroud, and Gouy 1994).

For each of these genes, we compared rates (1) between the two duplicated copies, as in Hughes and Hughes (1993), and (2) between the lineage with the duplication and the lineage without the duplication. Indeed, if gene duplication is followed by a relaxation of selective constraint on genes, the resulting copies should evolve faster than their orthologs in other species which were not duplicated. Unlike the first test, the second should be able to detect a decrease in selection on either or both copies. In practice, we built a 2 x 2 contingency table to compare the observed and expected numbers of genes under the hypothesis of independence between the lineage with the duplication and the lineage with the higher substitution rate (mammals or teleost fish).
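As a rough illustration of the second test, the sketch below computes the expected cell counts of a 2 x 2 table under independence, together with a two-sided Fisher exact P value. The cell counts are hypothetical placeholders chosen only to agree with the margins reported in the text (19 genes, 14 duplicated in fish, 16 evolving faster in fish); they are not the values of table 2. The use of Fisher's exact test is also an assumption, since the text does not name the test applied to the table.

```python
from math import comb

def expected_counts(table):
    """Expected 2 x 2 cell counts under independence of rows and columns."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    return [[r * k / n for k in col_totals] for r in row_totals]

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact P value: total probability of all tables with
    the same margins that are no more likely than the observed table."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# HYPOTHETICAL counts, for illustration only (not table 2).
# Rows: duplication in fish, duplication in mammals.
# Columns: faster in fish, faster in mammals.
observed = [[12, 2], [4, 1]]
expected = expected_counts(observed)
p_value = fisher_exact_two_sided(observed)
print(expected, p_value)
```

With observed counts this close to the expected ones, the exact test gives no evidence against independence, which is the pattern the text describes.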

Genes were recovered from the HOVERGEN database (Duret, Mouchiroud, and Gouy 1994), allowing for selection of orthologs and of lineage-specific duplications. For rate tests, only genes with a clear outgroup were retained. We thus obtained a data set of 19 genes, of which 5 showed a specific duplication in the mammalian lineage, and 14 showed a specific duplication in the teleost lineage (table 1). Amino acid sequences were aligned by CLUSTAL W (Thompson, Higgins, and Gibson 1994) and corrected by eye with Seaview (Galtier, Gouy, and Gautier 1996). DNA sequences were not directly used, as synonymous substitutions were saturated over the evolutionary distances we considered. Phylogenetic analyses were done with Phylo_win (Galtier, Gouy, and Gautier 1996) using neighbor joining (Saitou and Nei 1987) with Poisson-corrected distances. Rates were compared between lineages using the relative-rate test on all available sequences (Wilson, Carlson, and White 1977; Robinson et al. 1998), with a Poisson correction for multiple substitutions, implemented in RRTree (Robinson-Rechavi and Huchon 2000). Although the relative-rate test normally relies on an assumption of constant rates over time, the test remains valid under a fractal model of rate variation over time (Bickel 2000). In any case, we were interested in testing whether or not the observed rate variation is linked to gene duplications; random error arising from nonconstant evolutionary rates may introduce some noise but probably not a systematic bias.
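The quantities involved can be sketched for the simplest case of three aligned sequences. The code below computes the Poisson-corrected distance d = -ln(1 - p), where p is the proportion of differing amino acid sites, and the rate difference between two lineages measured against a common outgroup. This is not the RRTree implementation, which works on groups of sequences weighted by a tree and also estimates a variance; the function names and the toy sequences are hypothetical.

```python
from math import log

def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping positions with a gap ('-')."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("sequences differ at every site; distance undefined")
    return -log(1.0 - p)

def rate_difference(seq_1, seq_2, outgroup):
    """Distance of lineage 1 to the outgroup minus that of lineage 2.
    A positive value means lineage 1 accumulated more substitutions."""
    return poisson_distance(seq_1, outgroup) - poisson_distance(seq_2, outgroup)

# HYPOTHETICAL toy alignment, for illustration only.
fish = "MKTAYIAKQR"
mammal = "MKTAYLAKQR"
outgroup = "MKTAYLAKQK"

delta_k = rate_difference(fish, mammal, outgroup)
print(round(delta_k, 4))
```

In this toy case the fish sequence differs from the outgroup at two sites and the mammalian sequence at one, so the rate difference is positive.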


Table 1 Genes Used in this Study

 
Four genes out of 19 exhibited significant rate differences between the two (or more) duplicate genes at the 5% level (table 1), but none exhibited such differences at the 0.26% level (5% divided over 19 tests), which should be used to take into account test repetition. If duplication indeed releases constraint on protein evolution, we expect faster evolution in the lineage in which the duplication happened; however, it is immediately apparent from table 2 that there is no relationship between gene duplication and evolutionary rate. Indeed, the observed numbers of genes in each category are almost identical to those expected by chance: evolution is not faster for duplicated genes.
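The corrected threshold quoted above is consistent with a Bonferroni-type correction, in which the nominal level is divided by the number of tests. A minimal check, with a hypothetical function name:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 19)
print(f"{threshold:.2%}")  # prints 0.26%
```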


Table 2 Rate Comparisons for Duplicated Genes

 
As expected under the "more genes in fish" hypothesis (Wittbrodt, Meyer, and Schartl 1998), there were more duplicated genes in the fish lineage than in the mammalian lineage. A less expected result was that most genes by far (16/19) had higher rates in fish than in mammals, irrespective of duplication, including in the five cases where the test was significant at the 5% level. Choosing mice to represent mammals, and zebrafish to represent teleost fish, we concatenated all genes, choosing one copy randomly every time there was a duplication. With this alignment of 7,483 amino acids, the zebrafish lineage evolved very significantly faster than the mouse lineage (rate difference ΔK = 0.058 ± 0.009, significant at a 10⁻⁷ threshold), even though genes are known to evolve faster in murids than in other mammals (Wu and Li 1985). This clearly confirms that genes evolve faster in fish than in mammals.
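The order of magnitude of this significance can be checked with a simple normal approximation. The sketch below assumes that the value after the ± sign is a standard error and that the ratio of the rate difference to its standard error is approximately standard normal; neither assumption is stated in the text, so this is only a plausibility check.

```python
from math import erfc, sqrt

def two_sided_p_from_z(z):
    """Two-sided P value for a standard normal deviate z."""
    return erfc(abs(z) / sqrt(2.0))

delta_k = 0.058          # rate difference reported in the text
standard_error = 0.009   # ASSUMED to be a standard error

z = delta_k / standard_error
p = two_sided_p_from_z(z)
print(round(z, 2), p < 1e-7)
```

Under these assumptions the deviate exceeds 6, and the corresponding P value falls below the 10⁻⁷ threshold quoted above.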

Various life history parameters have been suggested to correlate with substitution rate, but metabolic rates (Martin and Palumbi 1993) are clearly lower in fish than in mammals, while it is hardly possible to define generation time (Wu and Li 1985) over groups as vast as mammals and teleost fish. The main hypothesis to explain the origin of so many duplicate genes in teleost fish is an ancestral genome duplication (Amores et al. 1998). However, systematic phylogenetic analysis leads us to propose instead a model of many independent gene or chromosome duplications (data not shown). This is consistent with a proposed model of frequent single hox cluster duplications and losses (Stellwag 1999). Spliceosomal introns also appear to have been gained many times independently in different fish lineages (Venkatesh, Ning, and Brenner 1999). Thus, high substitution rates are probably another manifestation of the dynamism of fish genomes, and it seems that, genetically at least, fish are anything but "primitive" vertebrates.

There does not appear to be any specific relaxation of selection on duplicate genes leading to higher rates of evolution and thus further exploration of the adaptive space. Hughes (1994) suggested that after duplication, both copies retain the same activity but gain different specificities, notably in their regulation. An alternative model is that duplication allows divergence of a few otherwise highly constrained amino acids, leading to functional divergence. Although no systematic study seems to have been conducted, most experimental results indicate that divergence of duplicate genes in fish affects expression patterns more than protein activity, in support of Hughes' model.

Acknowledgements

We thank Hector Escriva and two anonymous reviewers for critical reading, and Association de Recherche sur le Cancer, Centre National de la Recherche Scientifique, the European Molecular Biology Organisation, Fondation pour la Recherche Médicale, Ligue Nationale contre le Cancer, and Ministère de l'Education Nationale for financial support.

Footnotes

Naruya Saitou, Reviewing Editor

1 Keywords: selection, generation time, relative-rate test, molecular clock, genome duplication

2 Address for correspondence and reprints: Marc Robinson-Rechavi, Centre National de la Recherche Scientifique UMR 5665, Laboratoire de Biologie Moléculaire et Cellulaire, Ecole Normale Supérieure de Lyon, 46 Allée d'Italie, 69364 Lyon cedex 07, France. E-mail: marc.robinson@ens-lyon.fr

Literature Cited

    Amores, A., A. Force, Y. L. Yan et al. (13 co-authors). 1998. Zebrafish hox clusters and vertebrate genome evolution. Science 282:1711–1714

    Bickel, D. R. 2000. Implications of fluctuations in substitution rates: impact on the uncertainty of branch lengths and on relative-rate tests. J. Mol. Evol. 50:381–390

    Duret, L., D. Mouchiroud, and M. Gouy. 1994. HOVERGEN: a database of homologous vertebrate genes. Nucleic Acids Res. 22:2360–2365

    Galtier, N., M. Gouy, and C. Gautier. 1996. SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12:543–548

    Hughes, A. L. 1994. The evolution of functionally novel proteins after gene duplication. Proc. R. Soc. Lond. B Biol. Sci. 256:119–124

    Hughes, M. K., and A. L. Hughes. 1993. Evolution of duplicate genes in a tetraploid animal, Xenopus laevis. Mol. Biol. Evol. 10:1360–1369

    Martin, A. P., and S. R. Palumbi. 1993. Body size, metabolic rate, generation time, and the molecular clock. Proc. Natl. Acad. Sci. USA 90:4087–4091

    Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, New York

    Robinson, M., M. Gouy, C. Gautier, and D. Mouchiroud. 1998. Sensitivity of the relative-rate test to taxonomic sampling. Mol. Biol. Evol. 15:1091–1098

    Robinson-Rechavi, M., and D. Huchon. 2000. RRTree: relative-rate tests between groups of sequences on a phylogenetic tree. Bioinformatics 16:296–297

    Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425

    Stellwag, E. J. 1999. Hox gene duplication in fish. Semin. Cell Dev. Biol. 10:531–540

    Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680

    Venkatesh, B., Y. Ning, and S. Brenner. 1999. Late changes in spliceosomal introns define clades in vertebrate evolution. Proc. Natl. Acad. Sci. USA 96:10267–10271

    Vogel, G. 1998. Doubled genes may explain fish diversity. Science 281:1119–1121

    Wilson, A. C., S. S. Carlson, and T. J. White. 1977. Biochemical evolution. Annu. Rev. Biochem. 46:573–639

    Wittbrodt, J., A. Meyer, and M. Schartl. 1998. More genes in fish? Bioessays 20:511–515

    Wu, C. I., and W. H. Li. 1985. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc. Natl. Acad. Sci. USA 82:1741–1745

Accepted for publication November 14, 2000.