*Galton Laboratory, Department of Biology, University College London, London, England;
Department of Molecular Biology and Genetics, Cornell University;
and
Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, La Jolla, California
Introduction |
---|
Abalones are marine mollusks (order Archaeogastropoda) that broadcast-spawn their gametes into seawater, where fertilization and embryogenesis occur. Many sympatric abalone species overlap in depth zonation and reproductive seasonality, yet maintain themselves as distinct species. Molecular incompatibilities in the gamete recognition system may be responsible for reproductive barriers between species. Lysin is a 16-kDa protein that is released by the sperm onto the egg vitelline envelope (VE), an elevated, glycoproteinaceous, protective microchamber in which development occurs. Lysin is a nonenzymatic protein that unravels the tightly intertwined glycoprotein fibers of the VE to create a 3-µm-diameter hole through which the sperm passes before fusing with the egg cell membrane. The species specificity of lysin's ability to dissolve isolated egg VEs can be quantitatively demonstrated (reviewed in Vacquier et al. 1999). Analysis of mitochondrial DNA sequences permitted dating of the abalone species divergence times and revealed that lysins evolve at surprisingly rapid rates (Metz, Robles-Sikisaka, and Vacquier 1998).
Lysin is the first gamete recognition protein whose crystal structure has been solved (Shaw et al. 1993, 1995; Kresge, Vacquier, and Stout 2000a, 2000b). Red (Haliotis rufescens) and green (Haliotis fulgens) abalone sperm lysins differ in 51 out of 134 amino acid positions. Although highly divergent in amino acid sequence, the two lysins share several conserved features. For example, the alpha-carbon atom ribbon diagrams of the two lysins are essentially identical and thus superimposable with an average rms deviation of 1.14 Å. Other conserved structural features include one surface of the protein having a hydrophobic patch of 11–16 amino acids (by which lysin dimerizes) and, on the opposite surface, two tracks of basic amino acids which run down the entire length of the lysin. Variations in sequence between species map to the lysin surface, where they could be involved in species-specific recognition of the VE (Kresge, Vacquier, and Stout 2000a, 2000b). The N- and C-termini lie on the same surface of the lysin. The N-terminal residues 2–12 are always species-unique in sequence, while the C-terminus is moderately variable, but not species-unique. Recombinant lysins, in which both termini were exchanged between two species, demonstrated that these domains play important roles in species-specific recognition leading to VE dissolution (Lyon and Vacquier 1999).
The only molecule in the VE that binds lysin with high, species-selective affinity is a fibrous glycoprotein of 1,000 kDa named VERL (vitelline envelope receptor for lysin). VERL is a major structural element of the egg VE. Cloning and sequencing showed that VERL is largely composed of approximately 28 tandemly repeating (intronless) 153-amino-acid motifs. Sequencing of VERL repeats from the seven species of Californian abalone showed that VERL repeats are subject to weak purifying selection and evolve by the process of concerted evolution (Elder and Turner 1995). This is in contrast to lysin, VERL's cognate binding partner, which evolves rapidly in response to strong selection pressure.
Lee, Ota, and Vacquier (1995) determined cDNA sequences for lysin from 20 worldwide abalone species and performed pairwise sequence comparisons to estimate dS and dN using the method of Nei and Gojobori (1986; hereinafter referred to as NG). The ω (= dN/dS) ratio was found to be >1 when closely related species were compared but <1 when divergent species were compared (see fig. 3 in Lee, Ota, and Vacquier 1995). Lee, Ota, and Vacquier (1995) hypothesized that continuous selective pressure may have driven lysin evolution and that the small estimates of the ω ratio in comparisons of divergent species may be due to saturation of nonsynonymous substitutions and functional constraints on lysin structure.
Materials and Methods |
---|
Models of variable ω ratios among lineages were fitted by ML to the alignment of 25 sequences (Yang 1998; Yang and Nielsen 1998). The "one-ratio" model assumes the same ω ratio for all branches. The "free-ratios" model assumes an independent ω ratio for each branch. Comparison of the two models constitutes an LRT of the hypothesis that the ω ratio is identical among lineages. The free-ratios model is parameter-rich and is unlikely to produce accurate estimates for all ω ratios. Nevertheless, it is interesting to estimate the ω ratios without constraints, as knowledge of which lineages are under diversifying selection may provide clues to the selective pressure driving lysin's divergence. For example, if closely related, sympatric species show strong diversifying selection, while distant allopatric species show purifying selection, a selective pressure related to speciation may be implicated. Such a hypothesis can be implemented as a "two-ratios" model that assumes different ω ratios for sympatric and allopatric lineages in the phylogeny.
Models of variable ω ratios among sites were used to test for the presence of sites under diversifying selection (with ω > 1) and to identify them (Nielsen and Yang 1998). Note that in this paper, a site refers to an amino acid or codon rather than a nucleotide. We use the following five models for the ω distribution (table 1), implemented in the CODEML program of the PAML package (Yang 1999; Yang et al. 2000). Model M1 (neutral) assumes two classes of sites in the protein: the conserved sites at which ω = 0 and the neutral sites at which ω = 1. Model M2 (selection) adds a third class of sites with ω as a free parameter, thus allowing for sites with ω > 1. Model M3 (discrete) uses a general discrete distribution with three site classes, with the proportions (p₀, p₁, and p₂) and the ω ratios (ω₀, ω₁, and ω₂) estimated from the data. Model M7 (beta) uses a beta distribution B(p, q), which, depending on parameters p and q, can take various shapes (such as L, J, U, and inverted U shapes) in the interval (0, 1). Model M8 (beta & ω) adds an extra class of sites to the beta (M7) model, with the proportion and the ω ratio estimated from the data, thus allowing for sites with ω > 1. From these models, we construct three LRTs (table 2), which compare M0 (one ratio) with M3 (discrete), M1 (neutral) with M2 (selection), and M7 (beta) with M8 (beta & ω), respectively. When the alternative models (M2, M3, and M8) suggest the presence of sites with ω > 1, all three tests can be considered tests of positive selection (Nielsen and Yang 1998; Yang et al. 2000). However, the comparison of M0 with M3 may also be considered a test of variable ω values among sites. After ML estimates of parameters are obtained, Bayes' theorem is used to calculate the posterior probabilities of site classes for each site (Nielsen and Yang 1998). If the ω ratios for some site classes are >1, sites with high posterior probabilities for those classes are likely to be under diversifying selection.
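The posterior calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CODEML implementation: the function name `site_class_posteriors` and the per-class site likelihoods in the usage example are hypothetical, while the prior proportions are the M3 estimates reported in the Results.

```python
def site_class_posteriors(priors, site_likelihoods):
    """Posterior probability of each site class for one site (Bayes' theorem).

    priors           -- class proportions p_k (ML estimates), summing to 1
    site_likelihoods -- probability of the data at the site given that the
                        site belongs to class k (one value per class)
    """
    if len(priors) != len(site_likelihoods):
        raise ValueError("one likelihood per site class is required")
    # Joint probability of (class k, data at the site).
    joint = [p * f for p, f in zip(priors, site_likelihoods)]
    total = sum(joint)
    if total <= 0.0:
        raise ValueError("site data have zero probability under all classes")
    # Normalize so that the posteriors sum to 1.
    return [j / total for j in joint]


# Prior proportions of the three M3 site classes (estimates from the Results).
m3_priors = [0.329, 0.402, 0.269]

# HYPOTHETICAL per-class likelihoods for a site whose data strongly favor
# the class with the largest omega; these numbers are for illustration only.
hypothetical_likelihoods = [1e-12, 4e-10, 6e-8]

posteriors = site_class_posteriors(m3_priors, hypothetical_likelihoods)
print([round(p, 3) for p in posteriors])
```

When the data at a site are equally probable under every class, the posteriors returned equal the priors, which is why informative sites can move far from the prior proportions while uninformative sites cannot.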
Computer Simulation
Computer simulation was performed to examine the effect of saturation of nonsynonymous substitutions (e.g., due to functional constraints on lysin) on the analysis of variable selective pressures among lineages. Codon sequences were simulated using the EVOLVER program in the PAML package. The program generates a codon sequence for the root of the tree and "evolves" the sequence along branches in the phylogeny using specified branch lengths and substitution parameters. The ω ratio was assumed to be either constant across sites or variable according to a discrete distribution. Each simulated data set was analyzed in the same way as the original data either by pairwise comparison or by ML joint analysis. Values of parameters used in the simulation are described later.
Results |
---|
The hypothesis ω = 1 can be tested using an LRT comparing the null model H₀ with ω = 1 fixed and the alternative model H₁ with ω estimated from the data. Suppose the log-likelihood values under the two models are ℓ₀ and ℓ₁, respectively. Then, 2Δℓ = 2(ℓ₁ − ℓ₀) can be compared with a χ² distribution with df = 1 to test whether ω is significantly different from 1. Use of the χ² approximation is reliable when the sample size is "large" or when the sequence is long. As the lysin gene has only about 130 codons, we performed computer simulations to check the reliability of the χ² approximation. We examined two pairwise comparisons, with the estimated ω being >1 and <1, respectively. The first comparison was between H. rufescens and H. sorenseni. The estimates under H₁ were t = 0.149 and ω = 5.572, with ℓ₁ = −634.04, while under H₀, t = 0.149 and ℓ₀ = −636.09. Thus, 2Δℓ = 2(ℓ₁ − ℓ₀) = 4.10, with a P value of 0.04 from the χ² distribution. We simulated 200 replicate data sets using parameter estimates from H₀, and for each data set, we calculated the log-likelihood values under H₀ and H₁. The test statistics among replicates are used to construct a histogram. The observed value is at the fifth percentile of the simulated distribution. This P value is close to that from the χ² distribution. A second case is that of the comparison between H. rufescens and H. roei. Estimates of parameters under H₁ were t = 1.411 and ω = 0.482, with ℓ₁ = −910.15, while under H₀, t = 1.348 and ℓ₀ = −914.01. Thus, 2Δℓ = 7.73, and P = 0.0054 from the χ² distribution. The P value estimated from 1,000 replicates simulated under H₀ was 0.010. In both comparisons, the P values from the χ² distribution are close to those from the simulated distributions. While the limitations of pairwise comparison should be borne in mind, the χ² approximation to the LRT appears usable even for pairwise comparisons and small genes such as lysin. Use of the χ² approximation for data of multiple sequences, which contain more information, is expected to be more reliable.
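The two pairwise tests above can be reproduced with a short Python sketch. It is an illustration only, not the authors' code: the function names are hypothetical, the log-likelihood values are those quoted in the text, and the χ² tail probability uses the identity P(X > x) = erfc(√(x/2)), which holds for df = 1 only. The last function shows how a P value would be read off a set of statistics simulated under the null model; no simulated values are reproduced here.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Return 2 * (l1 - l0) for a pair of nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with df = 1."""
    if x <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def simulated_p_value(observed, simulated_statistics):
    """Fraction of null-simulated statistics at least as large as observed."""
    hits = sum(1 for s in simulated_statistics if s >= observed)
    return hits / len(simulated_statistics)


# H. rufescens vs. H. sorenseni: l0 = -636.09 (omega fixed at 1), l1 = -634.04.
stat_close = lrt_statistic(-636.09, -634.04)
p_close = chi2_sf_df1(stat_close)

# H. rufescens vs. H. roei: l0 = -914.01, l1 = -910.15.
stat_distant = lrt_statistic(-914.01, -910.15)
p_distant = chi2_sf_df1(stat_distant)

print(f"close pair:   2*delta = {stat_close:.2f}, P = {p_close:.3f}")
print(f"distant pair: 2*delta = {stat_distant:.2f}, P = {p_distant:.4f}")
```

Because the quoted log likelihoods are rounded to two decimals, the second statistic computed this way can differ from the published 7.73 in the last digit.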
The LRT statistics are plotted in figure 2C as a function of the estimated ω ratio. For 12 comparisons, ω is >1 at the 1% significance level (fig. 2C). In 13 additional pairs, ω is >1 at the 5% level. It is noteworthy that a large estimate of the ω ratio is not necessarily strong evidence for adaptive evolution, as an estimate based on very few changes is unreliable. In two comparisons, that between H. discus hannai and H. gigantea and that between H. rubra and H. conicopora, two nonsynonymous differences and no synonymous differences were found. The estimate of ω in both comparisons was ∞ (dN = 0.0065, dS = 0). However, this ratio is not significantly >1; the test statistic 2Δℓ = 1.0 is rather small, with P = 0.3. Because there were only two changes, we are unable to rule out chance effects.
For distant comparisons, the estimated ω ratios are <1. In 107 comparisons, ω is significantly <1 at the 1% level, while in 49 additional comparisons, ω is significantly <1 at the 5% level. For the remaining 119 comparisons, the null hypothesis ω = 1 cannot be rejected. While many more pairs show ω < 1 than ω > 1, the pairwise comparisons are not independent and the patterns are not easy to interpret.
Variable Selective Pressures and Positive Selection Along Lineages
The one-ratio model assumes the same ω ratio for all lineages (fig. 1) and involves a total of 58 parameters: 47 branch lengths, 9 parameters for the nucleotide frequencies at the three codon positions, the transition/transversion rate ratio κ, and the dN/dS ratio ω. The log-likelihood value under this model was ℓ₀ = −4,682.42, with parameter estimates κ = 1.56 and ω = 0.929. This ω ratio was an average over all sites and lineages. While larger than estimates from most other genes (see, e.g., Li 1997; Yang and Nielsen 1998), this ratio was <1. The free-ratios model, which assumes an independent ω ratio for each branch, was then applied to the same data. The tree in figure 1 has 47 branches, so 46 additional ω parameters are involved in this model. Estimates of the ω ratios are shown along branches in figure 1. The log-likelihood value under this model was ℓ₁ = −4,627.27. Comparison of 2Δℓ = 2(ℓ₁ − ℓ₀) = 2 × 55.15 = 110.30 with the χ² distribution (df = 46) suggests rejection of the one-ratio model, with P = 0.3 × 10⁻⁶. The ω ratios are extremely variable among lineages.
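The parameter count and the test of the one-ratio model against the free-ratios model can be checked with the following Python sketch. It is an illustration rather than the CODEML computation: the function name `chi2_sf_even_df` is hypothetical, and it relies on the closed form of the χ² tail probability that is available when the degrees of freedom are even (here df = 46).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m, P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)**k / k!,
    i.e. the probability that a Poisson(x/2) variable is at most m - 1.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    term = math.exp(-half)      # k = 0 term of the sum
    total = term
    for k in range(1, df // 2):
        term *= half / k        # build (x/2)**k / k! incrementally
        total += term
    return total


n_branches = 47
# Branch lengths + 9 codon-position nucleotide frequencies + kappa + omega.
params_one_ratio = n_branches + 9 + 1 + 1
# One omega per branch instead of a single shared omega.
params_free_ratios = params_one_ratio + (n_branches - 1)
df = params_free_ratios - params_one_ratio

lnl_one_ratio = -4682.42
lnl_free_ratios = -4627.27
statistic = 2.0 * (lnl_free_ratios - lnl_one_ratio)
p_value = chi2_sf_even_df(statistic, df)

print(params_one_ratio, df, round(statistic, 2), f"{p_value:.1e}")
```

The same function applies to the site-model tests with df = 2 and df = 4, since those degrees of freedom are also even.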
Estimates of ω ratios in figure 1 suggest that the ω ratios for recent lineages linking closely related or sympatric species are most often >1, while those for branches separating distantly related species are all <1. For example, in the North Pacific clade (species 1–10 from California and Japan), all except two very short branches have estimates of ω > 1. A similar pattern is seen among closely related species in the Australia clade (species 11–16). One explanation is that lysin may be under pressure to evolve to establish reproductive isolation during sympatric speciation, while such pressure is absent in allopatric lineages or after reproductive barriers are well established. One difficulty with testing such a hypothesis is that the models used here only describe the average pattern along each lineage. If a lineage undergoes a short episode of positive selection but is under purifying selection most of the time, the average ω ratio for the lineage may not be >1. Another difficulty with testing the hypothesis is that we do not know whether a node represents a sympatric or an allopatric speciation event. Here, we used a phylogeny (fig. 3) for a subset of species from the North Pacific (species 1–10) and Australia (species 11–16) clades to fit a two-ratios model, assuming a ratio ω₀ for the branch connecting the two clades and another ratio ω₁ for branches within the two clades. The one-ratio model for this subset data gives ℓ₀ = −2,719.60, with the estimate ω = 1.450. The two-ratios model gives ℓ₁ = −2,698.46, with ω₀ = 0.272 between clades and ω₁ = 2.396 within clades. The LRT suggested that the two ω ratios were significantly different; 2Δℓ = 2(ℓ₁ − ℓ₀) = 2 × 21.14 = 42.28, and P < 0.8 × 10⁻¹⁰ with df = 1. Furthermore, an LRT comparing models with ω₁ = 1 constrained and without such constraint suggested that ω₁ was significantly >1; 2Δℓ = 2 × (−2,698.46 − (−2,712.97)) = 29.02 is much greater than the 1% critical value of χ² = 6.63 with df = 1.
However, the two branches separating the Japanese (H. sieboldii, H. discus hannai, and H. gigantea) and Californian species probably represent allopatric rather than sympatric speciation. When those two branches were assigned the ratio ω₀, the same conclusions as above were reached. The two-ratios model gave ℓ₁ = −2,706.10 with ω₀ = 0.477 and ω₁ = 2.326, and the two-ratios model with ω₁ = 1 constrained gave ℓ₀ = −2,718.76. Thus, ω₁ was significantly >ω₀ (2Δℓ = 27.00) and also significantly >1 (2Δℓ = 25.32). Nevertheless, the large estimates of ω for the two lineages under the free-ratios model, which were averages over all sites, did not appear to be due to chance effects and were in contradiction to the hypothesis.
Variable Selective Pressures Among Sites and Identification of Amino Acids Under Diversifying Selection
Table 1 lists parameter estimates and log-likelihood values under models of variable ω ratios among sites. Model M0 (one ratio) assumes the same ω ratio for all sites and fits the data far more poorly than any of the other models, which account for variable ω ratios across sites. For example, M3 (discrete) involves four more parameters than M0 (one ratio), and the LRT statistic 2Δℓ = 436.14 is much greater than the 1% critical value of χ² = 13.28 with df = 4 (table 2). The results suggest extreme variation in selective pressure among amino acid sites.
All three models that allow for the presence of positively selected sites, i.e., M2 (selection), M3 (discrete), and M8 (beta & ω), do suggest the presence of such sites (table 1). Allowing for the presence of positively selected sites (with ω > 1) improves the fit of the models significantly. For example, the neutral model (M1) does not allow for sites with ω > 1. The selection model (M2) adds an additional site class, with the ω ratio estimated to be 3.8. The log-likelihood improvement was huge, as 2Δℓ = 191.90 should be compared with the 1% critical value of χ² = 9.21 with df = 2 (table 2). Comparison between M7 (beta) and M8 (beta & ω) produced similar results (table 2). M7 (beta) fits the data better than M1 (neutral), as it allows for sites at which 0 < ω < 1 (table 1).
Posterior probabilities for site classes calculated under M3 (discrete) are plotted in figure 4. ML estimation suggests that the three site classes are in proportions p₀ = 0.329, p₁ = 0.402, and p₂ = 0.269, with the ratios ω₀ = 0.085, ω₁ = 0.911, and ω₂ = 3.065 (table 1). Those proportions are the prior probabilities that any site belongs to each of the three classes. The data (codon configurations in different species) at a site alter the prior probabilities dramatically, such that the posterior probabilities may be very different from the prior probabilities. For example, the posterior probabilities for site 1 are 0.944, 0.056, and 0.000, and this site is very likely to be under strong purifying selection. The probabilities for site 4 are 0.000, 0.000, and 1.000, and this site is almost certainly under diversifying selection. The analysis was also performed under models M2 (selection) and M8 (beta & ω), and the results (not presented) were highly similar. For example, the probabilities that site 127 belongs to the class of positively selected sites (with the ω ratio being 3.82 under M2, 3.07 under M3, and 2.96 under M8; table 1) were 0.979, 0.957, and 0.950 under the three models, respectively. For site 116, the corresponding probabilities were 0.872, 0.806, and 0.802. Table 1 lists sites inferred to be under positive selection under different models at the 95% cutoff.
We distinguished between two factors. The first was saturation, that is, inadequacy of the estimation procedure to correct for multiple nonsynonymous substitutions at the same site at high sequence divergences. When the substitution model is correct, the ML method does not appear to involve such a bias. Previous simulations (Yang and Nielsen 2000) suggest that the ML method tends to overestimate rather than underestimate ω in short sequences when ω > 1. We generated a data set under the one-ratio model with κ = 1.6 and ω = 3 and with branch lengths three times as large as those estimated from the lysin data. The sequences were more divergent and the ω ratio was much higher than in lysin. For this data set, NG underestimated the ω ratio in almost all pairwise comparisons, with the average ω = 1.8. ML produced estimates around the correct value of 3 (with the average ω = 3.0), although with large sampling errors at high divergences.
The second factor was variation of selective pressures or ω ratios among sites. To examine whether such a variation caused the LRT to suggest incorrectly significant differences among lineages, we simulated data sets under model M3 (discrete) using parameter estimates obtained from the lysin data under the same model (table 1). Each simulated replicate was then analyzed using two models: the one-ratio model assuming one ω for all branches, and the free-ratios model assuming one ω ratio for each branch. Both models assume no variation among sites. The statistic 2Δℓ was then calculated. As the free-ratios model is computation-intensive, only 10 replicates were simulated. In none of the 10 replicates did the LRT reject the null model of one ω ratio for all lineages. The statistic 2Δℓ ranged from 32.5 to 60.2 among replicates. Note that if the LRT is strictly correct, there will be a 5% chance of rejecting the null hypothesis at the 5% level.
In another set of simulations, 12 replicates were simulated under model M3 (discrete) with three classes of sites in proportions 0.2, 0.3, and 0.5 with ω ratios of 0.1, 1, and 10. The branch lengths used were three times those estimated from the lysin data. The 2Δℓ statistics ranged from 40.4 to 78.9 among replicates. The null model of one ω ratio for all lineages was incorrectly rejected in four replicates at the 1% level (χ² = 71.20) and in two more replicates at the 5% level (χ² = 62.83). The extreme nonsynonymous rate variation among sites and the high sequence divergence had a substantial effect on the LRT, causing far more false positives than indicated by the χ² distribution. Nevertheless, the observed statistic, 110.30, was well outside the range from the simulation performed under even such extreme parameter values. It is thus safe to conclude that the ω ratios are variable among lineages, even though the LRT ignores variable selective pressures at sites.
Tracing Evolutionary Changes at Amino Acid Sites by Reconstructing Ancestral Lysins
Codons and amino acids at ancestral nodes were inferred using the empirical Bayes method (Yang, Kumar, and Nei 1995) for the subtree of species 1–16 (fig. 3). The codon-based model assumed two ω ratios, one for lineages within the North Pacific and Australian clades, and another for the branch between the two clades. Ancestral amino acids were then used to map changes onto branches in the phylogeny (fig. 3). The ancestral lysin proteins may be useful for future laboratory studies.
As mentioned earlier, in two pairwise comparisons, that between H. discus hannai and H. gigantea and that between H. rubra and H. conicopora, two nonsynonymous differences and no synonymous differences were found. These two species pairs deserve special attention. Ancestral reconstruction suggests that the ancestor of H. discus hannai and H. gigantea had GGA (Gly) at site 116 and GAA (Glu) at site 120, both with probability 1.00, and that the differences between the two species are due to a GGA (Gly) → AGA (Arg) change at site 116 and a GAA (Glu) → GCA (Ala) change at site 120 along the H. discus hannai lineage. The ancestor of H. rubra and H. conicopora had GCC (Ala) at site 32 with probability 1.00 and AAC (Asn) at site 74 with probability 0.97, so the differences between the two species are most likely due to a GCC (Ala) → ACC (Thr) change at site 32 and an AAC (Asn) → GAC (Asp) change at site 74 along the H. rubra lineage. Note that previous analysis suggests that sites 32, 74, and 120 (P > 0.99) and site 116 (P ≈ 0.8) are under diversifying selection (table 1 and fig. 4). While the individual pairwise comparisons discussed earlier cannot rule out chance effects as the cause for the observed differences in the two species pairs, the combined evidence makes it most likely that those differences occurred not by chance, but by selective pressure. It would be interesting to examine the efficiency of the egg-sperm interaction within and between species for these species pairs.
Discussion |
---|
While the data provide support for variable selective pressures among lineages, indicating episodic evolution in lysin, it is unclear what may have caused this variation. Estimates of the ω ratios under the free-ratios model suggest that recent lineages of closely related sympatric species tend to be under diversifying selection, while old lineages separating distantly related species tend to lack such pressure. One hypothesis is that lysin evolution is driven by the selective pressure to establish cross-species reproductive barriers, possibly through reinforcement (Dobzhansky 1940). However, this explanation is contradicted by the high estimates of the ω ratio for the two branches separating the California and Japanese species, which most likely represent an allopatric speciation event.
Another factor that may cause variable ω ratios among lineages is population size fluctuation. The importance of selection relative to random drift increases with the population size. Thus, a slightly deleterious mutation will have a greater chance of getting fixed, and a slightly advantageous mutation will have a reduced chance of fixation in a smaller population (Ohta 1973). The majority of lineages with estimated ω < 1 involve potential dispersal (founding) events, which might result in small population sizes. When the population is small, random fixations of mutations may be as important in lysin evolution as positive selection. Small population sizes may also lead to low animal density and low sperm concentration, and thus reduced sperm competition, a factor that may be driving lysin evolution. While differences in population size may create variable ω ratios among lineages, they do not seem likely to change the direction of selection or to produce both lineages with ω > 1 and ω < 1.
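The dependence of fixation on population size can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. This formula is not part of the analysis in this paper; it is used here only as a sketch under stated assumptions: a diploid population whose census and effective sizes are both N, genic selection with coefficient s, and an initial frequency of 1/(2N). The function names and the values of s and N are hypothetical. The quantity compared is the fixation probability relative to that of a neutral mutation, 1/(2N), which is the sense in which selection becomes less effective in small populations.

```python
import math


def fixation_probability(s, n):
    """Kimura's approximation: u = (1 - exp(-2s)) / (1 - exp(-4Ns)).

    Assumes a new mutation (frequency 1/(2N)), genic selection, and
    effective size equal to census size N. Illustrative only.
    """
    if s == 0.0:
        return 1.0 / (2.0 * n)          # neutral limit
    exponent = -4.0 * n * s
    if exponent > 700.0:
        return 0.0                      # strongly deleterious: avoids overflow
    # expm1(x) = exp(x) - 1 keeps precision when |x| is small.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def relative_to_neutral(s, n):
    """Fixation probability divided by the neutral value 1/(2N)."""
    return fixation_probability(s, n) * 2.0 * n


# Hypothetical selection coefficients and population sizes.
for s in (0.001, -0.001):
    for n in (100, 10_000):
        print(f"s = {s:+.3f}, N = {n:>6}: "
              f"relative fixation = {relative_to_neutral(s, n):.3g}")
```

Under these assumptions, a slightly advantageous mutation is only marginally favored over a neutral one when N is small, and a slightly deleterious mutation is only marginally disfavored, whereas in a large population the two behave very differently.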
An additional hypothesis incorporates the evolution of VERL repeats by concerted evolution (Elder and Turner 1995). It has been hypothesized that lysin evolves to match the VERL repeats, which are being homogenized by concerted evolution (Swanson and Vacquier 1998). Positive selection may favor lysins that match the new VERL repeat or lysins that tend to conserve lysin's three-dimensional structure while diversifying its amino acid side chains. Thus, VERL changes first, with the change propagated by concerted evolution, and lysin adapts to match its cognate VERL repeat while conserving its overall three-dimensional structure. The rate and extent of homogenization of repeating motifs may vary between lineages and species (Modi 1993), and there is evidence that the homogenization of VERL repeats differs between species (unpublished data). This variation in VERL homogenization rates may lead to variation in the selective pressure on lysin among lineages. For example, once lysin-VERL interaction has been optimized, purifying selection may act on lysin until a new VERL repeat type appears and is propagated by concerted evolution. Instead of being the driving force in achieving reproductive isolation (the mismatch of lysin and VERL), the species specificity of abalone fertilization may evolve as a by-product of the way in which these two cognate gamete recognition proteins evolve.
Footnotes |
---|
1 Keywords: abalone, fertilization, likelihood ratio test, lysin, maximum likelihood, molecular adaptation, molecular evolution, positive selection, reinforcement, sperm-egg recognition
2 Address for correspondence and reprints: Ziheng Yang, Galton Laboratory, Department of Biology, University College London, 4 Stephenson Way, London NW1 2HE, United Kingdom. E-mail: z.yang{at}ucl.ac.uk
Literature Cited |
---|
Biermann, C. H. 1998. The molecular evolution of sperm bindin in six species of sea urchins (Echinoidea: Strongylocentrotidae). Mol. Biol. Evol. 15:1761–1771
Civetta, A., and R. S. Singh. 1998. Sex-related genes, directional selection, and speciation. Mol. Biol. Evol. 15:901–909
Dobzhansky, T. 1940. Speciation as a stage in evolutionary divergence. Am. Nat. 74:302–321
Elder, J. F. J., and B. J. Turner. 1995. Concerted evolution of repetitive DNA sequences in Eukaryotes. Q. Rev. Biol. 70:297–320
Ferris, P. J., C. Pavlovic, S. Fabry, and U. W. Goodenough. 1997. Rapid evolution of sex-related genes in Chlamydomonas. Proc. Natl. Acad. Sci. USA 94:8634–8639
Goldman, N., and Z. Yang. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725–736
Hellberg, M. E., and V. D. Vacquier. 1999. Rapid evolution of fertilization selectivity and lysin cDNA sequences in teguline gastropods. Mol. Biol. Evol. 16:839–848
. 2000. Positive selection and propeptide repeats promote rapid interspecific divergence of a gastropod sperm protein. Mol. Biol. Evol. 17:458–466
Kresge, N., V. D. Vacquier, and C. D. Stout. 2000a. 1.35 and 2.07 Å resolution structures of the red abalone sperm lysin monomer and dimer reveal features involved in receptor binding. Acta Crystallogr. D 56:34–41
Kresge, N., V. D. Vacquier, and C. D. Stout. 2000b. The high resolution crystal structure of green abalone sperm lysin: implications for species-specific binding of the egg receptor. J. Mol. Biol. 296:1225–1234
Lee, Y.-H., T. Ota, and V. D. Vacquier. 1995. Positive selection is a general phenomenon in the evolution of abalone sperm lysin. Mol. Biol. Evol. 12:231–238
Lee, Y.-H., and V. D. Vacquier. 1995. Evolution and systematics in Haliotidae (Mollusca: Gastropoda): inferences from DNA sequences of sperm lysin. Mar. Biol. 124:267–278
Li, W.-H. 1997. Molecular evolution. Sinauer, Sunderland, Mass
Lyon, J. D., and V. D. Vacquier. 1999. Interspecies chimeric sperm lysins identify regions mediating species-specific recognition of the abalone egg vitelline envelope. Dev. Biol. 214:151–159
Metz, E. C., and S. R. Palumbi. 1996. Positive selection and sequence arrangements generate extensive polymorphism in the gamete recognition protein bindin. Mol. Biol. Evol. 13:397–406
Metz, E. C., R. Robles-Sikisaka, and V. D. Vacquier. 1998. Nonsynonymous substitution in abalone sperm fertilization genes exceeds substitution in introns and mitochondrial DNA. Proc. Natl. Acad. Sci. USA 95:10676–10681
Modi, W. S. 1993. Heterogeneity in the concerted evolution process of a tandem satellite array in meadow mice (Microtus). J. Mol. Evol. 37:48–56
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426
Nielsen, R., and Z. Yang. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936
Ohta, T. 1973. Slightly deleterious mutant substitutions in evolution. Nature 246:96–98
Palumbi, S. R. 1994. Genetic divergence, reproductive isolation and marine speciation. Annu. Rev. Ecol. Syst. 25:547–572
Palumbi, S. R., and E. C. Metz. 1991. Strong reproductive isolation between closely related tropical sea urchins (genus Echinometra). Mol. Biol. Evol. 8:227–239
Shaw, A., P. A. Fortes, C. D. Stout, and V. D. Vacquier. 1995. Crystal structure and subunit dynamics of the abalone sperm lysin dimer: egg envelopes dissociate dimers, the monomer is the active species. J. Cell Biol. 130:1117–1125
Shaw, A., D. E. McRee, V. D. Vacquier, and C. D. Stout. 1993. The crystal structure of lysin, a fertilization protein. Science 262:1864–1867
Swanson, W. J., and V. D. Vacquier. 1995. Extraordinary divergence and positive Darwinian selection in a fusogenic protein coating the acrosomal process of abalone spermatozoa. Proc. Natl. Acad. Sci. USA 92:4957–4961
Swanson, W. J., and V. D. Vacquier. 1998. Concerted evolution in an egg receptor for a rapidly evolving abalone sperm protein. Science 281:710–712
Tsaur, S. C., and C.-I. Wu. 1997. Positive selection and the molecular evolution of a gene of male reproduction, Acp26Aa of Drosophila. Mol. Biol. Evol. 14:544–549
Vacquier, V. D., and Y.-H. Lee. 1993. Abalone sperm lysin: unusual mode of evolution of a gamete recognition protein. Zygote 1:181–196
Vacquier, V. D., W. J. Swanson, E. C. Metz, and C. D. Stout. 1999. Acrosomal proteins of abalone spermatozoa. Adv. Dev. Biochem. 5:49–81
Wyckoff, G. J., W. Wang, and C.-I. Wu. 2000. Rapid evolution of male reproductive genes in the descent of man. Nature 403:304–309
Yang, Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15:568–573
Yang, Z. 1999. Phylogenetic analysis by maximum likelihood (PAML) (http://abacus.gene.ucl.ac.uk/software/paml.html). University College London, London
Yang, Z., S. Kumar, and M. Nei. 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641–1650
Yang, Z., and R. Nielsen. 1998. Synonymous and nonsynonymous rate variation in nuclear genes of mammals. J. Mol. Evol. 46:409–418
Yang, Z., and R. Nielsen. 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 17:32–43
Yang, Z., R. Nielsen, N. Goldman, and A.-M. K. Pedersen. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449