National Marine Fisheries Service, Northwest Fisheries Science Center, Conservation Biology Division, Seattle, Washington
Abstract |
---|
Introduction |
---|
In an earlier paper, Ford, Thornton, and Park (1999) compared replacement (dn) and silent (ds) substitution rates among four salmonid species at the transferrin gene and found dn/ds ratios significantly greater than 1.0 in three of the six pairwise comparisons. This result suggested that positive selection for new replacement alleles has played a large role in the evolution of transferrin within at least some salmonid species. Iron competition with salmonid pathogens could be one selective mechanism. In this paper, the earlier salmonid results are put in a broader phylogenetic context through analysis of transferrin sequence variation from 25 vertebrates, including 7 additional recently published salmonid species (Lee et al. 1998; table 1). The specific goals of the study were to identify where in the history of its evolution vertebrate transferrin had been subject to positive selection and to use recently developed likelihood techniques (Goldman and Yang 1994; Nielsen and Yang 1998; Yang 1998) to identify specific positively selected sites. By mapping the sites that appear to be subject to positive selection onto specific functional regions of the protein, it should be possible to gain additional insight into the potential selective mechanisms responsible for the high dn/ds ratios among salmonid species.
Materials and Methods |
---|
Estimation of dn/ds Ratios and Selected Sites
Several of the codon-based likelihood models of Nielsen and Yang (1998) and Yang (1998) were used to estimate dn/ds ratios for each branch of an estimated transferrin phylogeny. The likelihood models provide more accurate estimates of dn/ds ratios than do simpler approximation methods and allow specific selected sites to be identified (Yang and Nielsen 2000). In order to produce valid results, the models require an accurate phylogeny of the sequences involved. In this study, phylogenies were estimated from the aligned DNA sequences using maximum-likelihood, parsimony, and neighbor-joining methods (Saitou and Nei 1987) as implemented in the DNAML, DNAPARS, and DNADIST programs in the PHYLIP computer package (Felsenstein 1993).
Several models of codon evolution were fitted either to the entire data set or to specific subsets of the data using the PAML computer package (Goldman and Yang 1994; Nielsen and Yang 1998; Yang 1998). All the models used maximum-likelihood methods for estimating the parameters of a transition matrix describing the substitution rates between pairs of codons, including dn/ds ratios, transition/transversion ratios, and branch lengths. The simplest model was called the single-ratio model. This model estimated a single average dn/ds ratio, ω, across every branch and every codon. The next model, called the free-ratio model, estimated a dn/ds ratio, ωb, for every branch b in the tree. The third model, called the neutral model, assumed a single dn/ds ratio for all branches in the tree but allowed for two different types of sites, each with a different ω value. Sites of the first type, in frequency p0, were subject to strong selection against replacement mutations and had a dn/ds ratio of ω0 = 0. Sites of the second type, in frequency p1, were neutral and had a dn/ds ratio of ω1 = 1. The fourth model, called the positive selection model, was the same as the neutral model except that there was a third class of sites, in frequency ps, that had a dn/ds ratio of ωs. This model only provided support for positive selection if the estimate of ωs was greater than 1.0. The probability that a particular codon site is positively selected can be estimated using the empirical Bayes' approach described by Nielsen and Yang (1998). For all models, the equilibrium codon frequencies were estimated from the products of the average observed nucleotide frequencies in the three codon positions (the f3x4 option in the PAML package). In order to determine if the model results were sensitive to assumptions about equilibrium codon frequencies, all models were also run under an assumption of equal codon frequencies (the 1/61 option in the PAML package). Both codon frequency assumptions produced very similar results, and only the results using the f3x4 assumption are reported here.
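The f3x4 calculation described above can be sketched in a few lines. This is a minimal illustration and not the PAML implementation: the function name `f3x4_codon_frequencies`, the pooling of base counts across all sequences, the use of the universal genetic code, and the renormalization over the 61 sense codons are all assumptions made for the example.

```python
from itertools import product

BASES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: universal genetic code


def f3x4_codon_frequencies(sequences):
    """Codon frequencies as products of position-specific base frequencies."""
    # Count each base separately at codon positions 1, 2, and 3,
    # pooling over all aligned, in-frame sequences (an assumption).
    counts = [{b: 0 for b in BASES} for _ in range(3)]
    for seq in sequences:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if all(b in BASES for b in codon):  # skip gaps and ambiguity codes
                for pos, base in enumerate(codon):
                    counts[pos][base] += 1

    # Convert the counts into frequencies for each codon position.
    pos_freqs = []
    for c in counts:
        total = sum(c.values())
        pos_freqs.append({b: c[b] / total for b in BASES})

    # Multiply the three position-specific frequencies for every sense codon,
    # then renormalize so the 61 values sum to one.
    raw = {}
    for codon in map("".join, product(BASES, repeat=3)):
        if codon not in STOP_CODONS:
            raw[codon] = (pos_freqs[0][codon[0]]
                          * pos_freqs[1][codon[1]]
                          * pos_freqs[2][codon[2]])
    norm = sum(raw.values())
    return {codon: value / norm for codon, value in raw.items()}
```

Under the alternative equal-frequency assumption, every sense codon would simply be assigned a frequency of 1/61, so comparing results under the two settings is a direct check on how much the estimates depend on codon usage.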
Results and Discussion |
---|
Inspection of the free-ratio model results clearly suggests different dn/ds ratios within the salmonids compared with other vertebrates. For this reason, and for reasons of computational efficiency, further analyses were applied to two smaller data sets: a data set consisting of all of the salmonid transferrin sequences and a data set consisting of the Eutherian mammal transferrin and lactoferrin sequences. The dn/ds ratios estimated under the free-ratio model for these smaller data sets were similar to the ratios estimated using the complete data set (figs. 2 and 3). The ratios estimated for the four alternative salmonid phylogenies were very similar to each other (fig. 2), showing that the results are robust to a variety of plausible phylogenies.
Neutral and Positive-Selection Models
Fitting the neutral and positive-selection models to the data is useful for two reasons. First, by comparing the fit of the two models, one can determine if the positive-selection model provides a significantly better fit to the data than the neutral model, and second, the positive-selection model can be used to identify specific sites that have been subject to positive selection. The relative fit of the two models can be evaluated using a likelihood ratio test (e.g., Nielsen and Yang 1998 and references therein). Under the hypothesis that the two nested models provide an equally good fit to the data, twice the log likelihood difference between the two models is expected to be approximately χ² distributed with the number of degrees of freedom equal to the difference in the number of parameters between the models.
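Written out, the test statistic takes the following form. The symbols here are introduced only for illustration: \ell_0 and \ell_1 are the maximized log likelihoods of the simpler and the more general model, and q_0 and q_1 are their numbers of free parameters.

```latex
2\,\Delta\ell \;=\; 2\left(\ell_{1} - \ell_{0}\right) \;\sim\; \chi^{2}_{k},
\qquad k = q_{1} - q_{0}
```

For the neutral versus positive-selection comparison, the more general model adds one site-class frequency (ps) and one dn/ds ratio (ωs), which gives k = 2.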
For the salmonid data set, the positive-selection model provided a much better fit to the data than did the neutral model (twice the log likelihood difference between the two models varied from 130.5 to 142.4 depending on the phylogeny used, df = 2, P < 0.001; table 2). The positive-selection model estimated that 13%–14% of the codons were subject to positive selection during their evolutionary history, with an average dn/ds ratio of 6.6–7.1 (table 2), clear evidence of strong positive selection. The estimates of the proportion of neutral sites and sites subject to strong constraint were 39%–41% and 46%–47%, respectively (table 2).
For the Eutherian mammal sequences, the selection model also provided a much better fit to the data than did the neutral model (twice the log likelihood difference = 430.54, df = 2, P < 0.001; table 2). The dn/ds ratio for selected sites under the selection model, however, was estimated to be less than 1.0, so even though the selection model provided a better fit to the data than did the neutral model, the selection model did not support a hypothesis of positive selection. Rather, the model was simply more realistic than the neutral model because it had three classes of sites instead of two, thus allowing for greater variation in selective constraint among sites.
Within the Eutherian mammal tree, two branches (labeled "a" and "b" in fig. 3) had estimated dn/ds ratios greater than 1.0 under the free-ratio model. In order to determine if the dn/ds ratios for these two branches were significantly greater than 1.0, two additional models were employed. The first additional model, the two-ratio model, assumed there were two dn/ds ratios within the Eutherian tree, one ratio for branches a and b, ωa,b, and another ratio for all other branches, ω0. The second additional model, the two-ratio constrained model, was the same as the first except that ωa,b was constrained to be equal to 1.0. The models differed by one degree of freedom, and a comparison of twice the log likelihood difference between the models indicated that the dn/ds ratios for the a and b branches were not significantly different from 1.0 (twice the log likelihood difference = 0.74, df = 1, P < 0.4; table 2).
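The P values quoted for these likelihood ratio tests can be checked with a short calculation. The sketch below uses the closed-form upper tail of the χ² distribution for 1 and 2 degrees of freedom, so it needs only the standard library; the helper name `chi2_sf` is an assumption for this example, and the test statistics are the values reported in the text.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variate, for df = 1 or 2 only."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        # P(X > x) = P(|Z| > sqrt(x)) for a standard normal Z
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # chi-square with 2 df is exponential with mean 2
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")


# Twice the log likelihood differences as reported in the text.
reported_tests = [
    ("salmonids, neutral vs. selection (smallest)", 130.5, 2),
    ("salmonids, neutral vs. selection (largest)", 142.4, 2),
    ("Eutherian mammals, neutral vs. selection", 430.54, 2),
    ("Eutherian branches a and b, ratio free vs. fixed at 1", 0.74, 1),
]

for label, stat, df in reported_tests:
    print(f"{label}: 2*dlnL = {stat}, df = {df}, P = {chi2_sf(stat, df):.3g}")
```

The last test gives P of about 0.39, in line with the reported P < 0.4, while the three comparisons with df = 2 give P values many orders of magnitude below 0.001.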
Locations of Selected Sites
The selection model estimates the proportion of sites in a gene subject to positive selection, but does not identify which specific sites have actually been selected. An empirical Bayes' approach can be used along with the results of the selection model, however, to estimate the probability that specific sites are subject to selection (Nielsen and Yang 1998). Using this method, 29 sites in salmonid transferrin were estimated to be positively selected with posterior probability greater than 0.95 for at least one of the four alternative salmonid trees (table 3).
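The empirical Bayes step amounts to applying Bayes' rule at each codon, with the estimated class frequencies acting as the prior. The sketch below is an illustration only: the function name is an assumption, the class frequencies are round numbers chosen to fall within the ranges estimated for salmonids, and the per-class site likelihoods are hypothetical values standing in for quantities that would come from the fitted codon model.

```python
def site_class_posteriors(class_freqs, site_likelihoods):
    """Posterior probability of each site class for one codon site.

    class_freqs      -- estimated class frequencies (p0, p1, ps)
    site_likelihoods -- likelihood of the data at this site under each class
    """
    joint = [p * lik for p, lik in zip(class_freqs, site_likelihoods)]
    total = sum(joint)  # marginal likelihood of the site under the mixture
    return [j / total for j in joint]


# Class frequencies: constrained, neutral, positively selected (illustrative).
freqs = [0.46, 0.40, 0.14]

# Hypothetical likelihoods of one site's data under each class.
likelihoods = [1e-12, 4e-10, 9e-8]

posteriors = site_class_posteriors(freqs, likelihoods)
is_selected = posteriors[2] > 0.95  # the cutoff used in the text
print(posteriors, is_selected)
```

Because the class frequencies are themselves estimates, the resulting posterior probabilities inherit any uncertainty in the fitted model and in the assumed phylogeny, which is one reason the text reports sites across four alternative trees.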
The crystal structure of salmonid transferrin has not been determined, but the crystal structures of the N- and C-lobes of mammalian transferrins are very similar to each other (Anderson et al. 1989). The divergence time between the two lobes predates the divergence times between mammals and fish (e.g., Baldwin 1993), so it seems reasonable to assume that the three-dimensional structure of salmonid transferrin will be similar to that of the mammalian transferrin. The crystal structures of human transferrin, human lactoferrin, bovine lactoferrin, and rabbit transferrin are also all very similar (Moore et al. 1997), providing additional confidence that the general three-dimensional structure of transferrin is likely to be quite conserved among diverse species. Figure 4 shows the structure of diferric bovine lactoferrin (Moore et al. 1997), with the sites homologous to likely positively selected salmonid sites (table 3) shaded dark gray. Three of the 29 salmon sites identified as positively selected did not have homologous sites in the bovine sequence. Consistent with the bacterial-selection hypothesis, the 26 remaining sites that were estimated to be subject to positive selection in salmonids were all found near the outside of the molecule, and in several cases sites that were not close to each other in the primary sequence were physically close together in the folded molecule.
Why Are dn/ds Ratios Not High Among Mammals or Other Nonsalmonids?
If interaction with bacterial iron acquisition proteins is the mechanism leading to positive selection at transferrin, it is puzzling that the evidence for such selection is found only in salmonids. Transferrin plays a role in disease resistance in mammals (reviewed by Martinez, Delgado-Iribarren, and Baquero 1990), so if selection due to iron-binding competition from bacteria is a plausible selective mechanism in salmonids, it is also certainly a plausible selective mechanism in mammals. Furthermore, there is evidence that interactions between egg and sperm transferrin genotypes affect fertility in mammals (reviewed by Wedekind 1994), providing independent evidence that extant transferrin alleles are indeed under selection in at least some mammalian species. There are, however, at least four possible explanations for the lack of evidence for positive selection outside of the salmonids. The first possible explanation can be ruled out with the data in hand; the other three are speculative, and additional experiments will be required to test them.
First, the branch lengths outside of the salmonid clade of the transferrin tree are for the most part much longer than the branch lengths in the salmonid clade (fig. 1), suggesting the possibility that saturation of nonsynonymous substitutions could result in underestimation of dn/ds ratios outside of the salmonid clade. This possibility was tested by using the evolver program in the PAML package (see Materials and Methods) to simulate trees with branch lengths and topologies identical to the Eutherian mammalian tree in figure 3 but with a parametric dn/ds ratio of 1.15 (equal to the single-ratio estimate for the salmonids; table 2). The results of these simulations show that the model can adequately estimate dn/ds ratios similar to the salmonid ratios for trees with branch lengths equal to those estimated for the mammalian transferrin tree (the mean estimated dn/ds ratio for 10 simulations was 1.23, with a standard deviation of 0.056). Similar results were obtained using a tree of all of the nonsalmonid sequences (the mean estimated dn/ds ratio for 10 simulations was 1.22, with a standard deviation of 0.074). These simulation results show that longer branch lengths cannot explain the lack of high dn/ds ratios outside of the salmonid clade.
Second, some investigators have argued that fish rely more heavily on their nonadaptive immune systems than do higher vertebrates, perhaps leading to stronger selection on fish transferrin (e.g., Nonaka and Smith 2000). This explanation fails to explain why salmonid transferrins should be subject to greater positive selection than the transferrins of the nonsalmonid fishes in this study, however.
Third, the salmonid species in this study are all at least partially diadromous and therefore occupy a great variety of habitats throughout their life cycles (freshwater, estuarine, marine). This diverse lifestyle may result in exposure to a large variety of pathogens, thus imposing especially strong selection on transferrin.
Finally, the Salmonidae are ancestrally tetraploid (Sola, Cataudella, and Capanna 1981). There is evidence for duplicated transferrin genes in Atlantic salmon (Kvingedal, Rørvik, and Alestrøm 1993), and it is likely that other salmonid species contain two transferrin genes as well. It is possible that the genomewide duplication event that occurred in the lineage leading to the Salmonidae facilitated rapid evolution and adaptation of duplicated genes (e.g., Zhang, Rosenberg, and Nei 1998). Interestingly, the only two branches outside of the Salmonidae with dn/ds ratios greater than 1.0 (although not significantly so) occur soon after the duplication event leading to mammalian lactoferrin and transferrin (fig. 1).
In summary, within the salmonids, the selection model provides a significantly better fit to the data than the neutral model, with 13% of the codons in the gene estimated to be subject to positive selection with a dn/ds ratio of approximately 7. Of these selected codons, 29 could be identified with confidence. These selected sites code for residues on the outside of the transferrin molecule, and approximately half of them fall in areas bound by bacterial transferrin-binding proteins. In contrast to the salmonid results, there was no evidence for positive selection in the nonsalmonid sequences in this study.
Acknowledgements |
---|
Footnotes |
---|
1 Keywords: positive selection, likelihood, disease resistance, transferrin, salmon, evolution
2 Address for correspondence and reprints: Michael J. Ford, National Marine Fisheries Service, Northwest Fisheries Science Center, Conservation Biology Division, 2725 Montlake Boulevard East, Seattle, Washington 98112. mike.ford{at}noaa.gov
Literature Cited |
---|
Anderson, B. F., H. M. Baker, G. E. Norris, D. W. Rice, and E. N. Baker. 1989. Structure of human lactoferrin: crystallographic structure analysis and refinement at 2.8 Å resolution. J. Mol. Biol. 209:711–734
Baldwin, G. S. 1993. Comparison of transferrin sequences from different species. Comp. Biochem. Physiol. B 106:203–218
Banfield, D. K., B. K. Chow, W. D. Funk, K. A. Robertson, T. M. Umelas, R. C. Woodworth, and R. T. MacGillivray. 1991. The nucleotide sequence of rabbit liver transferrin cDNA. Biochim. Biophys. Acta 1089:262–265
Benton, M. J. 1997. Vertebrate paleontology. Chapman and Hall, New York
Demmer, J., S. J. Stasuik, F. M. Adamski, and M. R. Grigor. 1999. Cloning and expression of the transferrin and ferritin genes in a marsupial, the brushtail possum (Trichosurus vulpecula). Biochim. Biophys. Acta 1445:65–74
Evelyn, T. P. T. 1996. Infection and disease. Pp. 339–366 in G. Iwama and T. Nakanishi, eds. The fish immune system: organism, pathogen, and environment. Academic Press, San Diego
Felsenstein, J. 1993. PHYLIP (phylogeny inference package). Version 3.5c. Distributed by the author, Department of Genetics, University of Washington, Seattle. http://evolution.genetics.washington.edu/phylip.html
Ford, M. J., P. J. Thornton, and L. K. Park. 1999. Natural selection promotes divergence of transferrin among salmonid species. Mol. Ecol. 8:1055–1061
Goldman, N., and Z. Yang. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725–736
Gray-Owen, S. D., and A. B. Schryvers. 1996. Bacterial transferrin and lactoferrin receptors. Trends Microbiol. 4:185–191
Guerinot, M. L. 1994. Microbial iron transport. Annu. Rev. Microbiol. 48:743–772
Hershberger, C. L., J. L. Larson, B. Arnold et al. (12 co-authors). 1991. A cloned gene for human transferrin. Ann. N.Y. Acad. Sci. 646:140–154
Hirono, I., T. Uchiyama, and T. Aoki. 1995. Cloning, nucleotide sequence analysis, and characterization of cDNA for medaka (Oryzias latipes) transferrin. Mol. Mar. Biol. Biotechnol. 2:193–198
Hirst, I. D., and A. E. Ellis. 1996. Utilization of transferrin and salmon serum as sources of iron by typical and atypical strains of Aeromonas salmonicida. Microbiology 142:1543–1550
Hoshino, A., S. Hisayasu, and T. Shimada. 1996. Complete sequence analysis of rat transferrin and expression of transferrin but not lactoferrin in the digestive glands. Comp. Biochem. Physiol. B 113:491–497
Kappeler, S. R., M. Ackerman, Z. Farah, and Z. Puhan. 1999. Sequence analysis of camel (Camelus dromedarius) lactoferrin. Int. Dairy J. 9:481–486
Kim, Y.-D., J.-Y. Lee, Y.-K. Hong, J. Hikima, I. Hirono, and A. Takashi. 1997. Molecular cloning and sequence analysis of transferrin cDNA from Japanese flounder Paralichthys olivaceus. Fish. Sci. 63:582–586
Kvingedal, A. M., K. A. Rørvik, and P. Alestrøm. 1993. Cloning and characterization of Atlantic salmon (Salmo salar) serum transferrin cDNA. Mol. Mar. Biol. Biotechnol. 2:233–238
Le Provost, F., M. Nocart, G. Guerin, and P. Martin. 1994. Characterization of the goat lactoferrin cDNA. Assignment of the relevant locus to bovine U12 synteny group. Biochim. Biophys. Acta 203:1324–1332
Lee, J. Y., T. Tada, I. Hirono, and T. Aoki. 1998. Molecular cloning and evolution of transferrin cDNAs in salmonids. Mol. Mar. Biol. Biotechnol. 7:287–293
Lee, J.-Y., N. Tange, H. Yamashita, I. Hirono, and T. Aoki. 1995. Cloning and characterization of transferrin cDNA from coho salmon (Oncorhynchus kisutch). Fish Pathol. 30:271–277
Loehr, T. M., ed. 1989. Iron carriers and iron proteins. VCH, New York
Lyndon, J. P., B. R. O'Malley, O. Saucedo, T. Lee, D. R. Headon, and O. M. Conneely. 1992. Nucleotide and primary amino acid sequence of porcine lactoferrin. Biochim. Biophys. Acta 1132:97–99
McKay, S. J., R. H. Devlin, and M. J. Smith. 1996. Phylogeny of Pacific salmon and trout based on growth hormone type-2 and mitochondrial NADH dehydrogenase subunit 3 DNA sequences. Can. J. Fish. Aquat. Sci. 53:1165–1176
Martinez, J. L., A. Delgado-Iribarren, and F. Baquero. 1990. Mechanisms of iron acquisition and bacterial virulence. FEMS Microbiol. Rev. 75:45–56
Mazoy, R., and M. L. Lemos. 1991. Iron-binding proteins and heme compounds as iron sources for Vibrio anguillarum. Curr. Microbiol. 23:221–226
Moore, S. A., B. F. Anderson, C. R. Groom, M. Haridas, and E. N. Baker. 1997. Three-dimensional structure of diferric bovine lactoferrin at 2.8 Å resolution. J. Mol. Biol. 274:222–236
Moskaitis, J. E., R. L. Pastori, and D. R. Schoenberg. 1990. The nucleotide sequence of Xenopus laevis transferrin mRNA. Nucleic Acids Res. 18:6135
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the number of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426
Nicholas, K. B., and H. B. Nicholas. 1997. GeneDoc: analysis and visualization of genetic variation. http://www.cris.com/Ketchup/genedoc.shtml
Nielsen, R., and Z. Yang. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936
Nonaka, M., and S. L. Smith. 2000. Complement system of bony and cartilaginous fish. Fish Shellfish Immunol. 10:215–228
Oakley, T. H., and R. B. Phillips. 1999. Phylogeny of Salmonine fishes based on growth hormone introns: Atlantic (Salmo) and Pacific (Oncorhynchus) salmon are not sister taxa. Mol. Phylogenet. Evol. 11:381–393
Pierce, A., D. Colavizza, M. Benaissa, P. Maes, A. Tartar, J. Montreuil, and G. Spik. 1991. Molecular cloning and sequence analysis of bovine lactotransferrin. Eur. J. Biochem. 196:177–184
Retzer, M. D., A. Kabani, L. L. Button, R. H. Yu, and A. B. Schryvers. 1996. Production and characterization of chimeric transferrins for the determination of the binding domains for bacterial transferrin receptors. J. Biol. Chem. 271:1166–1173
Retzer, M. D., R. Yu, and A. B. Schryvers. 1999. Identification of sequences in human transferrin that bind to the bacterial receptor protein, transferrin-binding protein B. Mol. Microbiol. 32:111–121
Rey, M. W., S. L. Woloshuk, H. A. deBoer, and F. R. Pieper. 1990. Complete nucleotide sequence of human mammary gland lactoferrin. Nucleic Acids Res. 18:5288
Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425
Sola, L. S., S. Cataudella, and E. Capanna. 1981. New developments in vertebrate cytotaxonomy. III. Karyology of bony fishes: a review. Genetica 54:285–328
Stearley, R. F., and G. R. Smith. 1993. Phylogeny of the Pacific trouts and salmons (Oncorhynchus) and genera of the family Salmonidae. Trans. Am. Fish. Soc. 122:1–33
Suzumoto, B. K., C. B. Schreck, and J. D. McIntyre. 1977. Relative resistances of three transferrin genotypes of coho salmon (Oncorhynchus kisutch) and their hematological responses to bacterial kidney disease. J. Fish. Res. Board Can. 34:1–8
Tange, N., J.-Y. Lee, N. Mikawa, I. Hirono, and T. Aoki. 1997. Cloning and characterization of transferrin cDNA and rapid detection of transferrin gene polymorphism in rainbow trout (Oncorhynchus mykiss). Mol. Mar. Biol. Biotechnol. 6:354–359
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680
Wedekind, C. 1994. Mate choice and maternal selection for specific parasite resistances before, during and after fertilization. Philos. Trans. R. Soc. Lond. B Biol. Sci. 346:303–311
Winter, G. W., C. B. Schreck, and J. D. McIntyre. 1980. Resistance of different stocks and transferrin genotypes of coho salmon, Oncorhynchus kisutch, and steelhead trout, Salmo gairdneri, to bacterial kidney disease and vibriosis. Fish. Bull. 77:795–802
Yang, Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15:568–573
Yang, Z., and R. Nielsen. 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 17:32–43
Zhang, J., H. F. Rosenberg, and M. Nei. 1998. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc. Natl. Acad. Sci. USA 95:3708–3713