* Department of Biology, Long Island University; and Ornithology Department, American Museum of Natural History
Correspondence: E-mail: lorraine.marsh{at}liu.edu.
![]() |
Abstract |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Key Words: protein structure substitution rate rhodopsin homoplasy extra steps
![]() |
Introduction |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
One focus of studies has been the influence of structure on the probability that a specific amino acid residue will occupy a site (Dayhoff, Schwartz, and Orcutt 1978; Mizuguchi and Blundell 2000). A second focus is the manner in which structure influences the path of evolution from ancestor to modern protein. The first approach defines the likelihood that proteins fold similarly and has been used to assess homology of protein structures. The second approach incorporates data on the rate of change as well as on the specific amino acid changes to develop models for inference of phylogenetic trees. A general problem in recovering accurate phylogenies involves distinguishing historical signal from noise. This problem is compounded by variation in rates of evolution among sites in sequences. Identifying rapidly evolving sites, e.g., through protein structural data, may be necessary to infer accurate higher level phylogenies (Naylor and Brown 1997).
Initial investigations of variation in rates among sites in protein-coding genes focused at the nucleotide level, on variation among the three codon positions (Li 1993; Pamilo and Bianchi 1993). Codon-based methods recognized the difference in rates between silent and replacement substitutions (Goldman and Yang 1994; Muse and Gaut 1994). Recognition of variation at sites along the length of the sequence led to descriptions of the distribution of rates of change among sites; the most common currently used is the gamma distribution (Yang 1994, 1996). Continuing problems in recovering signal have led researchers to develop more realistic models of evolution. For proteins, these models address aspects of protein structure, by recognizing the differences in selection and the spatial correlation among residue sites (Fares et al. 2002; Schadt and Lange 2002; Soyer et al. 2002; Suzuki 2004) or by incorporating structurally correlated rates into amino acid replacement matrices (Koshi and Goldstein 1998; Kosiol, Goldman, and Buttimore 2004). Most models are codon based, work within a maximum-likelihood or Bayesian methodology, and result in categories or classes of amino acids, based on rates or substitution matrices.
We were interested in working within a conceptual framework for protein evolution that incorporated specific information from protein structure for classifying residue sites. To accomplish this, we analyzed sequences of rhodopsin for 62 vertebrate species encompassing five classes. Our goals were twofold; first, to develop a model for the family of rhodopsin and rhodopsin-like proteins, one of the family of G protein-coupled receptors. These receptors have been grouped based on their structure; all have seven TM helices that comprise a bundle of TM segments. Secondly, we wanted to increase our knowledge of the role of structure on the evolution of TM helical proteins. We worked within a maximum parsimony (MP) framework for this stage of our research. The minimal assumptions of MP (Swofford et al. 1996) were advantageous for examining protein evolution to develop our initial model. In addition, although MP may underestimate the number of changes at any one site, this is beneficial in providing a conservative estimate of homoplasy (similarity not derived from a common ancestor). Highly variable amino acid residues of rhodopsin were identified during an initial phylogenetic analysis. We tested the correlation of residue differences and residue homoplasy with structural characteristics of the rhodopsin protein to find protein structural features associated with high rates of change in this tree and identified a structural class of residues enriched for residues with a high rate of evolution. Incorporating this knowledge into a MP analysis led to a more accurate branching pattern in the rhodopsin tree.
![]() |
Materials and Methods |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
Results from the initial analysis were used to assess character quality and define residues with high rates of change. The evaluation of data quality within a phylogenetic context typically involves the comparison of character consistency. This is measured with the consistency index (CI) or retention index (RI) or some variation of these indices. The CI is computed for any character by dividing the minimum number (m) of changes (the number of character states 1) by the number of steps (s) on the tree (CI = m/s). These indices are scaled to vary between 0 and 1 and can be problematic for data comparison. Thus, one homoplasious step will reduce the CI from 1.0 to 0.5 for two state characters and from 1.0 to 0.75 for four state characters. In addition, RIs can be compromised by autapomorphic character states (Naylor and Kraus 1995). To circumvent these problems, we used an additional measure of quality related to the CI, the number of extra steps for each character (Griffiths et al. 2004). This is computed by subtracting the minimum number of steps from the number of steps on the phylogenetic tree (s m). This index has the advantage of being directly comparable for characters with different numbers of states.
Additional analyses were performed using this information. Two classes of residues were defined to test the effect on phylogenetic inference of tree-defined homoplastic residues versus structurally defined residues with high rates of evolution. In the analyses, these classes of residues were differentially weighted to reduce the influence of these residues on the length and topology of the inferred tree (Swofford et al. 1996). Definition of the first class, tree-defined homoplastic residues (residue sites with extra steps) was dependent on the configuration of the initial tree. This class included three partitions of "extra step" residues, those with 13, those with 48, and those with more than 8 extra steps. Phylogenies were inferred with the most homoplastic partition excluded and then with each partition differentially weighted. The second class of residues was based on protein structural features (see below) and was independent of the initial tree. To test the sensitivity of the analyses to character weights, four runs were performed in which classes of residues were excluded or downweighted (1/2, 1/5, and 1/10 the weight of the other residues).
We used nodal resolution and congruence with well-accepted vertebrate phylogenies (Hillis 1995; Chen, Bonillo, and Lecointre 2003) to assess the results of these analyses. If more than one most parsimonious tree (MPT) was found, then a strict consensus of the MPTs was used for comparisons.
Rhodopsin Structural Data
We wanted to identify features of the protein that might be associated with altered rates of evolution. Rhodopsin structural features (secondary structure and residue accessibility) were extracted using the program DSSP (Kabsch and Sander 1983) with coordinates from the Protein Data Bank (PDB) rhodopsin structure file 1F88. Twelve residues are missing in this structure and were excluded from analyses. Structure characters were pooled to create classes: sheet and extended structures were grouped as "ß-structure," and disordered, turn, and reverse helix were grouped as "loop." A major structural feature of rhodopsin is seven TM domains comprised of -helices (TM helix). For analytical purposes, residues classified as TM helical were limited to the major helical domains that span the membrane. The small, amphipathic helix following helix 7 was excluded from the TM designation because it lies in the cytoplasm. Residues involved in helix kinks were categorized as TM helical if they did not significantly disrupt the periodicity of the helix. Amino acid residues on the surface of the protein have greater freedom to evolve. Surface exposure is often quantified using a measure of accessibility of a residue side chain to a solvent molecule. Access values correspond to residue surface area that could contact a probe sphere rolled over the surface of the protein. Residue access was calculated using a water molecule of 1.4-Å radius as a probe, though residues in lipid environments may experience a slightly different accessibility to solvent. A majority of residues had significant access.
Correlations of Residue Variability and Homoplasy with Structure
We performed several tests to find structural correlates of high evolutionary rate. The correlation of the empirically defined residue differences (where residue differences equals the total number of amino acid states observed at a site minus 1) and extra steps with structural characteristics was determined to test various models for structural influence on evolution. We used nonparametric analyses (Wilcoxon test, Spearman correlation) because neither amino acid differences nor accessibility are normally distributed. The number of residue extra steps and amino acid differences can be fit to a Poisson distribution for the evolutionary distances studied, and accessibility is distributed according to geometric constraints of the folded protein (Lee and Richards 1971). The two-sample rank-sum Wilcoxon test was used to test if groups of sites exhibited significant differences in rates of evolution. Spearman's rank correlation coefficient, , was calculated to test the correlation between the accessibility of residues and variability and homoplasy. For analysis of residue access, residues missing in the 1F88 structure of rhodopsin were not considered. For correlations with extra steps, sites with only a single state were excluded because they were incapable of exhibiting extra steps. The predictive power of structural correlations depends on the distribution of rates. Rates often exhibit a distribution that can be described by the gamma function. The gamma shape parameter
determines the distribution of sites evolving at different rates. The
shape parameter was calculated independently for TM and non-TM residues (Gu and Zhang 1997).
Fourier Analysis of Evolutionary Patterns
We wished to analyze categories of residues on one side of a rhodopsin helix and distinguish them from residues on the other face of the helix. Fourier analysis is used to extract a periodic signal from noise (Walker 1999) and is capable of classifying residues on one side of a helix. To determine the periodic distribution of sites with a high evolution rate within rhodopsin, a standard Fourier analysis with a scanning window of 18 residues was used to detect patterns of amino acid substitution (Lio and Vannucci 2000). The use of appropriately scaled scanning windows has been shown effective in evolutionary feature detection in other contexts (Fares et al. 2002). Scanning analysis detected features indicating asymmetry on helices without defining the helix face involved. As a second step, the orientation of residue sites relative to the protein surface was determined from the angular difference (d) of the Fourier phase component for accessibility and amino acid differences. For this second analysis, windows of 28 residues centered in the middle of each TM domain were analyzed (Eisenberg, Weiss, and Terwillinger 1984). An index was derived from the Fourier phase components that reflected the relative exposure of a residue which might alter its evolutionary properties.
![]() |
Homology Modeling
The structure of bovine rhodopsin (Palczewski et al. 2000) was used to predict the accessibility of residues for sequences of all taxa in this study. One concern is that structural details of rhodopsin (i.e., access values) might change over the evolutionary time span of the taxa represented in this study. To test the stability of accessibility predictions and to test that the bovine structure provided a good model for distant rhodopsins, a homology model representing the predicted structure of Brachydanio rerio (zebra fish) rhodopsin was prepared. Homology modeling involves modeling a target protein on a template whose structure is known (Sanchez and Sali 1997) and can be considered an estimator for protein parameters. Modeling was accomplished using the program Modeller6v1 (Sanchez and Sali 1997) with a bovine rhodopsin structure template (1F88, PDB). Regions of the model corresponding to gaps in the rhodopsin structure template were not included in the zebra fish rhodopsin model.
![]() |
Results |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
There is not much resolution within mammals. Rabbit is placed as the basal taxon. An artiodactyls-cetacean (whales, pig, sheep, and cow) clade is recovered; however, pigs group with the cetaceans. Finally, manatees cluster with primates.
Correlation of Variability and Homoplasy with Structure
We were interested in a model of evolution in which various types of information such as site amino acid differences, site homoplasy, and site structure (access, secondary structure, and environment) contributed to phylogenetic inference. Correlations between these observed features were studied to improve predictions of rate for individual sites. Secondary structure type (ß-structure, -helix, and turn) did not correlate with amino acid substitution differences or extra steps (table 2). TM regions were enriched for extra steps but did not have a significant increase in levels of residue differences. The rate-shape parameters indicated that the rate of site evolution was more disperse in TM regions than in non-TM regions, supporting the idea that the environment for protein evolution in the membrane might be distinct. Accessible residues in TM regions (the receptor "spines," facing the lipid bilayer) exhibited a higher proportion of differences than surface residues of the protein as a whole. Both residue differences and extra steps were positively correlated with residue access in TM domains.
|
Localization of Evolutionary Patterns
Accessible residues in TM regions exhibited a high rate of evolution. We wished to determine how uniformly sites with a high rate were distributed within TM domains. Pattern analysis was used to study this distribution. A series of residues exhibiting amino acid differences or homoplasy was subjected to feature detection using a Fourier process to quantify the presence of repeating extra-step patterns (Walker 1999). The seven TM domains of rhodopsin are predominantly -helix with a periodicity of 3.6 residues. Analysis of the amino acid difference signal (fig. 2) indicated that TM regions exhibited a pattern consistent with that formed by substitutions biased to one face of an
-helix. Signal threshold was defined as being at least half of the maximal peak value. Helices 1, 3, 4, 5, and 6 produced strong signals. Helices 4, 5, and 6 gave strong extra-step signals with shorter signals in helices 1 and 7. This analysis did not indicate whether residues with amino acid differences were exposed or buried but localized regions with asymmetry in the distribution of high-rate residues. Variability was broadly distributed over rhodopsin TM domains and not limited to a few helices.
|
|
Final Phylogenetic Analyses
Analyses were performed to determine the effects of excluding and downweighting rapidly evolving residues. Excluding or downweighting the class of homoplastic residues in the first series of analyses resulted in trees similar to those in which no residue sites were excluded (fig. 1).
The analyses downweighting or excluding the structurally defined category of residues (SPINE characters) produced two different results. When the spine characters were given half the weight of the other residues, the phylogeny again was similar to that produced with equal weighting of all residues (fig. 1). When the SPINE characters were downweighted further (1/5 or 1/10) or excluded, a different consensus tree was produced (fig. 3). Although relationships within the teleost clade were somewhat less resolved, overall, this tree was more resolved than the initial analysis (56 resolved nodes, compared to 52 for the initial analysis).
|
Relationships within the tetrapod clade changed. The lizard and lampreys were still sister taxa, similar to the results including all residues. However, the frog and salamander species were now the basal group among the tetrapods included in this analysis, a generally accepted relationship within tetrapods (Cotton and Page 2002). This was followed by the lizard-lamprey clade, with the alligator-chicken species as sister taxa to mammals.
Relationships within mammals were more resolved. The rodent and carnivore species each formed monophyletic groups. Finally, rabbit was not basal to all mammal species but clustered within the rodent-primate clade (Euarchontoglires; Amrine-Madsen et al. 2003).
Homology Modeling
Bovine rhodopsin was used to predict the accessibility of residues for all taxa for residue classification purposes. To test that the bovine structure provided a good model for distant rhodopsins, the structure of zebra fish rhodopsin was predicted using homology modeling. Model residue positions are determined by energetics and constraints provided by the template. Modeling is generally successful if residue identity is >40% (Sanchez and Sali 1997). In the present instance, identity of rhodopsin from cow and Brachydanio rerio is 80.5%. For 1,336 backbone atoms, the 1F88/model root mean square deviation was 0.34 Å. This represents a deviation of less than one atom van der Waals radius between the fish predicted model and the bovine structure backbone, suggesting that the zebra fish protein sequence can be accommodated well within the framework of the bovine protein.
Figure 4 shows a scatterplot of accessibility values from the two branches of the tree. Accessibility of residues of the two species was highly correlated. Thus, we conclude that for rhodopsin, using the residue accessibility derived from one taxon was probably suitable for labeling residues of all taxa from fish to mammals. For trees representing a greater evolutionary distance, different branches might require separate calculations to accurately apply our method.
|
![]() |
Discussion |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Structural Correlates of Amino Acid Rate of Substitution
Though mutation rates vary for many reasons, one of the most pervasive constraints is the limit imposed by protein folding and stability. Properties of proteins such as variability of residues vary within the sequence and are dependent on placement of secondary structure elements (Mizuguchi and Blundell 2000). The long helical bundles of G protein-coupled receptors have often been used to study the effects of protein structure on residue features. Residue hydrophobicity (Eisenberg, Weiss, and Terwillinger 1984; Lio and Vannucci 2000) and residue substitution (Donnelly et al. 1993) exhibit a periodicity of 3.6, characteristic of -helix, detectable by Fourier analysis for several membrane proteins. We found no differences in rate among the different types of secondary structure in rhodopsin. However, amino acid differences and homoplasy correlated with residue access in TM domains. In a related analysis we showed that within TM domains, the amino acid differences series was asymmetrically distributed on the faces of TM helices 1, 3, 4, 5, and 6 and to a lesser extent in helices 2 and 7 as well. Homoplasious residues concentrated on the lipid face of helices 4, 5, and 6 with shorter segments in helices 1 and 7. The regions enriched for differences and homoplasy were similar but not identical. On average, residues with either high amino acid differences or high homoplasy were more likely to face the lipid bilayer. Because high variability of lipid-exposed residues has been observed in several diverse membrane proteins (Goldman, Thorne, and Jones 1998; Stevens and Arkin 2001) and because lipid exposure was a conserved site feature of rhodopsins in our sample, heterotachy (Lopez, Casane, and Philippe 2002), at least as a major evolutionary influence, seems a reduced concern. Both the fish and mammalian taxa independently showed increased site differences and homoplasy for accessible TM residues in rhodopsin (data not shown). There was also correlation of access with amino acid differences in aqueous-exposed regions, but the effect was weaker than in TM regions and was not observed for homoplasy.
Choosing a Rate Class of Residues Based on Structure
Ideally, individual residues could be assigned rates based on information including known functionality, structural constraints of the protein and others of similar structure, and observed substitution data. In the present case, we have limited analysis to identifying a single class of residues with higher than average rates. The underlying assumption is that removal of an outlying class of residues will increase the fit between model and data, improving phylogenetic inference. The SPINE characters had over three times the evolutionary rate of rhodopsin as a whole. The high rate dispersion within the TM domain may have contributed to our ability to select a high-rate group.
We based our analysis on residue access of bovine rhodopsin. Because the bovine rhodopsin structure is known, this is a conservative approach, based on experimental observation. One concern was that structural details of rhodopsin (e.g., access values) might have diverged in the taxa studied. Access values of a homology model of zebra fish supported the use of bovine rhodopsin as the template to select a class of high-rate residues for all taxa. Alternatives would be to estimate accessibility of site residue side chains of homology models of each taxon or to use a Fourier-based definition of accessibility determined by amino acid backbone position rather than side chain. We chose to include highly accessible residues from all TM domains in the excluded SPINE class though support for inclusion of helices 2 and 7 was not unambiguous. Helices 2 and 7 may be subject to distinct evolutionary selection forces or this result may reflect sampling variation.
Different Paths to Residue Variability and Homoplasy
Residue variability, a measure that roughly corresponds to states at residue sites, has been widely used to predict residue access in membrane proteins whose structure is unknown. It may be that homoplasy or extra steps, a related but distinct value, may also be helpful in predicting protein structure and function. Variability and extra steps can vary independently, though they were highly correlated in rhodopsin (table 2). Residue 83, a nonsurface residue, is a special case. In our phylogenies, two residues, Asp83 and Asn83, alternate in several lineages, producing only two states but high extra steps. It is known that these two residues confer very different signaling properties on rhodopsin (Breikers et al. 2001). The Asn83 form produces a rhodopsin that is blueshifted relative to the Asp83 form, which may be more adaptive in aquatic environments. One model for the high extra steps at site 83 is recurrent selection reflecting organism vision needs. Thus, the pool of residues with high extra steps in rhodopsin may contain two or more distinct groups of residues with distinct roles in structure. One group may interact very little with other residues and play only a minimal role in function, and another group may play a key role in function and be subject to selection for aspects of vision. Our focus has been on residue substitutions and extra steps that are strongly influenced by protein structure folding requirements, but extra steps for other residues may primarily be determined by their influence on protein function and organism fitness. Residue 83 lies in helix 2, a TM domain that failed to exhibit a clear bias of site differences toward surface residues. It is possible that conflicting evolutionary forces muted the signal we were studying. Certainly, protein structure is only one of the several forces acting on the evolution of rhodopsin.
Integrating Sequence and Structural Data in Models for Evolution and Phylogenetic Reconstruction
In this study, we used protein structure data to improve phylogenetic inference. A probabilistic model for dependency of a phylogenetic inference on available data is shown in figure 5. Much work has focused on improving estimation of rate parameters using sequence data (Dimmic, Mindell, and Goldstein 2000; Soyer et al. 2002). A complementary approach is to use other observable data to improve parameter and tree inference. Our model (fig. 5) explicitly indicates the dependency of the residue rate-change parameter on protein structure and incorporates structure on an equal basis with sequence. One approach to using protein structural data in tree building would be to create a full conditional Bayesian model including both sequence data and protein structural data. Development of this model will be deferred for a future publication. A simpler approach to using structural data is to collapse dependencies (shown as a dashed line in fig. 5) and use structural data directly for probabilistic classification of residues into rate classes, which can then be differentially weighted during phylogenetic inference, that is, the approach presented here. Observable structural data from X-ray crystallography are used to probabilistically categorize sites permitting exclusion of a site class with a probable rate inconsistent with the underlying model. Fully utilizing the data available from both taxa sequences and structural information produced a more consistent branching pattern.
|
![]() |
Acknowledgements |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
![]() |
Footnotes |
---|
![]() |
References |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Amrine-Madsen, H., K. Koepfli, R. K. Wayne, and M. S. Springer. 2003. A new phylogenetic marker, apolipoprotein B, provides compelling evidence for eutherian relationships. Mol. Phylogenet. Evol. 28:225240.[CrossRef][ISI][Medline]
Bowie, J. U., J. F. Reidhaar-Olson, W. A. Lim, and R. Sauer. 1990. Deciphering the message in protein sequences: tolerance to amino acid substitutions. Science 247:13061310.[ISI][Medline]
Breikers, G., P. H. Bovee-Geurts, G. L. DeCaluwe, and W. J. DeGrip. 2001. A structural role for Asp83 in the photoactivation of rhodopsin. Biol. Chem. 382:12631270.[CrossRef][ISI][Medline]
Chen, W., C. Bonillo, and G. Lecointre. 2003. Repeatability of clades as a criterion of reliability: a case study for molecular phylogeny of Acanthomorpha (Telostei) with larger number of taxa. Mol. Phylogenet. Evol. 26:262288.[CrossRef][ISI][Medline]
Cotton, J. A., and R. D. Page. 2002. Going nuclear: gene family evolution and vertebrate phylogeny reconciled. Proc. R. Soc. Lond. B Biol. Sci. 269:15551561.[CrossRef][ISI][Medline]
Dayhoff, M. O., R. M. Schwartz, and B. C. Orcutt. 1978. A model of evolutionary change in proteins. Pp. 345352 in M. O. Dayhoff, ed. Atlas of protein sequences and structure, Vol. 5, Suppl. 3. National Biomedical Research Foundation, Washington, D. C.
Dimmic, M. W., D. P. Mindell, and R. A. Goldstein. 2000. Modeling evolution at the protein level using an adjustable amino acid fitness model. Pac. Symp. Biocomput. 2000:1829.
Donnelly, D., J. P. Overington, S. V. Ruffle, J. H. A. Nugent, and T. L. Blundell. 1993. Modeling helical transmembrane domains: the calculation and use of substitution tables for lipid-facing residues. Protein Sci. 2:5570.
Eisenberg, D., R. M. Weiss, and T. C. Terwillinger. 1984. The hydrophobic moment detects periodicity in protein hydrophobicity. Proc. Natl. Acad. Sci. USA 81:140144.[Abstract]
Elmerot, C., U. Arnason, T. Gojobori, and A. Janke. 2002. The mitochondrial genome of the pufferfish, Fugu rubripes, and the ordinal telostean relationships. Gene 295:163172.[CrossRef][ISI][Medline]
Fares, M. A., S. F. Elana, J. Ortiz, A. Moya, and E. Barrio. 2002. A sliding window-based method to detect selective constraints in protein-coding genes and its application to RNA viruses. J. Mol. Evol. 55:509521.[CrossRef][ISI][Medline]
Goldman, N., J. L. Thorne, and D. T. Jones. 1998. Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics 149:445458.
Goldman, N., and Z. Yang. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725736.
Griffiths, C. S., G. F. Barrowclough, J. G. Groth, and L. Mertz. 2004. Phylogeny of the Falconidae (Aves): a comparison of the efficacy of morphological, mitochondrial, and nuclear data. Mol. Phylogenet. Evol. 32:100109.
Gu, X., and J. Zhang. 1997. A simple method for estimating the parameter of substitution rate variation among sites. Mol. Biol. Evol. 14:11061113.[Abstract]
Hillis, D. M. 1995. Approaches for assessing phylogenetic accuracy. Syst. Biol. 44:316.[ISI]
Kabsch, W. M., and C. Sander. 1983. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:25772637.[CrossRef][ISI][Medline]
Koshi, J. M., and R. A. Goldstein. 1998. Mathematical models of natural amino acid site mutations. Proteins 32:289295.[CrossRef][ISI][Medline]
Kosiol, C., N. Goldman, and N. H. Buttimore. 2004. A new criterion and method for amino acid classification. J. Theor. Biol. 228:97106.[CrossRef][ISI][Medline]
Lee, B., and F. M. Richards. 1971. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55:379400.[ISI][Medline]
Li, W.-H. 1993. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:9699.[ISI][Medline]
Lio, P., and M. Vannucci. 2000. Wavelet change-point prediction of transmembrane proteins. Bioinformatics 16:376382.[Abstract]
Lopez, P., D. Casane, and H. Philippe. 2002. Heterotachy, an important process of protein evolution. Mol. Biol. Evol. 19:17.
Miya, M., H. Takeshima, H. Endo et al. (12 co-authors). 2003. Major patterns of higher teleostean phylogenies; a new perspective based on 100 complete mitochondrial DNA sequences. Mol. Phylogenet. Evol. 26:121138.[CrossRef][ISI][Medline]
Mizuguchi, K., and T. Blundell. 2000. Analysis of conservation and substitution of secondary structure elements within protein superfamilies. Bioinformatics 16:11111119.[Abstract]
Muse, S. V., and B. S. Gaut. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol. Biol. Evol. 11:715734.
Naylor, G., and F. Kraus. 1995. The relationship between S and M and the retention index. Syst. Biol. 44:559562.[ISI]
Naylor, G. J. P., and W. M. Brown. 1997. Structural biology and phylogenetic estimation. Nature 388:528530.
Ng, P. C., J. G. Henikoff, and D. Henikoff. 2000. PHAT: a transmembrane-specific substitution matrix. Bioinformatics 16:760766.[Abstract]
Palczewski, K., T. Kumasaka, T. Hori et al. (12 co-authors). 2000. Crystal structure of rhodopsin: a G protein-coupled receptor. Science 289:739745.
Pamilo, P., and N. O. Bianchi. 1993. Evolution of the ZFX and ZFY genes: rates and interdependence between the genes. Mol. Biol. Evol. 10:271281.[Abstract]
Sanchez, R., and A. Sali. 1997. Advances in comparative protein-structure modeling. Curr. Opin. Struct. Biol. 7:206214.[CrossRef][ISI][Medline]
Schadt, E., and K. Lange. 2002. Codon and rate variation models in molecular phylogeny. Mol. Biol. Evol. 19:15341539.
Simmons, M. P., and M. Miya. 2004. Efficiently resolving the basal clades of a phylogenetic tree using Bayesian and parsimony approaches: a case study using mitogenomic data from 100 teleost fishes. Mol. Phylogenet. Evol. 31:351362.[CrossRef][ISI][Medline]
Soyer, O., M. W. Dimmic, R. R. Neubig, and R. A. Goldstein. 2002. Using evolutionary methods to study G-protein coupled receptors. Pac. Symp. Biocomput. 2002:625637.
Stevens, T. J., and I. T. Arkin. 2001. Substitution rates in helical transmembrane proteins. Protein Sci. 10:25072517.
Suzuki, Y. 2004. Three-dimensional window analysis for detecting positive selection at structural regions of proteins. Mol. Biol. Evol. 21:23522359.
Swofford, D. L. 1999. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.
Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis. 1996. Phylogenetic inference. Pp. 407514 in D. M. Hillis, C. Moritz, and B. K. Mable, eds. Molecular Systematics, Sinauer Associates, Sunderland, Mass.
Walker, J. S. 1999. A primer on wavelets and their scientific applications. CRC Press, LLC, New York, N.Y.
Yang, Z. 1994. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39:306314.[CrossRef][ISI][Medline]
. 1996. Among-site variation and its impact on phylogenetic analysis. Trends Ecol. Evol. 11:367372.[CrossRef][ISI]
|