Secondary Structure of Mitochondrial 12S rRNA Among Fish and Its Phylogenetic Applications

Hurng-Yi Wang and Sin-Che Lee

Department of Biology, National Taiwan Normal University, Taipei,
Laboratory of Molecular Systematics of Fishes, Institute of Zoology, Academia Sinica, Taipei, Taiwan


    Abstract
 
The complete 12S ribosomal RNA (rRNA) sequences from 23 gobioid species and nine diverse other fish species were used to establish a core secondary structure model for fish 12S rRNA. Of the 43 stems recognized, 41 were supported by at least some compensatory evidence among vertebrates. The rates of nucleotide substitution were lower in stems than in loops, which may leave stems with less phylogenetic information when recently diverged taxa are compared. An analysis of compensatory substitutions shows that the percentage of covariation is 68%, so the weighting factor for stems in phylogenetic analyses, to account for the dependence of mutations, should be 0.66. Different stem-loop weighting schemes applied to analyses of the phylogenetic relationships of the Gobioidei indicate that down-weighting paired regions because of nonindependence did not improve the present phylogenetic analysis. The loop regions show a biased nucleotide composition (adenine% [A%] > thymine% [T%], cytosine% [C%] > guanine% [G%]), as also observed in the mammalian counterpart. The excess of A and C in the loop regions may result from the asymmetric mechanism of mtDNA replication, which leads to the spontaneous deamination of C and A. This process may also be responsible for the transition-transversion bias and the patterns of nucleotide substitution in both stems and loops.


    Introduction
 
Sequences of the 12S ribosomal RNA (rRNA) gene have been widely used in phylogenetic studies among vertebrates (Douzery and Catzeflis 1995; Kjer 1995; Lavergne et al. 1996; Ledje and Arnason 1996; Richards and Moore 1996; Springer and Douzery 1996; Gatesy et al. 1997; Montgelard, Catzeflis, and Douzery 1997; Simons and Mayden 1998). rRNA plays a primary role in protein synthesis (Dahlberg 1989), and its function is largely determined by its structure (Noller 1984). The structures of rRNA among different organisms are more highly conserved than are the sequences themselves (Springer and Douzery 1996). It is, therefore, of particular importance that structure is taken into account when rRNA genes are aligned, especially when the alignments are intended for phylogenetic analyses, because phylogenetic inference requires comparisons of homologous characters among sequences.

In addition, the structures of rRNAs are maintained by Watson-Crick base pairing interactions in stem regions, which violates the assumption of nucleotide independence made in phylogenetic analyses. The usefulness of stems for phylogenetic analyses has been discussed in previous studies (Wheeler and Honeycutt 1988; Smith 1989; Dixon and Hillis 1993; Morrison and Ellis 1997; Mugridge et al. 2000), and conflicting conclusions have been reached. Recognizing this problem, several authors have proposed weighting schemes to account for compensatory substitutions in rRNA genes (Wheeler and Honeycutt 1988; Dixon and Hillis 1993; Springer, Hollar, and Burk 1995). Such weighting cannot be applied unless a reliable secondary structure is available.

In the case of rRNAs, comparative sequence analysis has played an important role in establishing secondary structure models because of the difficulty of X-ray crystallography studies on these large RNA species. Thus, rather detailed structures for 5S-, 16S-, and 23S-like rRNAs have now been inferred based primarily on comparative sequence analyses (Gutell et al. 1985; Ledje and Arnason 1996; Springer and Douzery 1996). The secondary structure of the 12S rRNA of fishes was proposed by Peer et al. (1994). Their model reveals a set of core pairing interactions common to 16S-like rRNAs. However, that study included only one sequence (Cyprinus carpio) and gave no compensatory evidence for base pairing interactions, so a further revision of the fish 12S rRNA core model is necessary.

Most evidence for compensatory changes has been found by comparing distantly related taxa. However, such covariant mutations can also occur within closely related groups (Ortí et al. 1996). In the present study, we compile a large number of gobioid 12S rRNA sequences, together with those from other fish groups (table 1), in order to evaluate and improve the 12S rRNA model of Peer et al. (1994) and to establish a core set of base pairing interactions for fishes.


Table 1 Taxa Analyzed in this Study with Their GenBank Accession Numbers Indicated

 
Once the model for 12S rRNA is determined, the effect of structural constraints within stem regions on phylogenetic analysis is considered. We applied different stem versus loop weighting schemes to examine the phylogenetic relationships of the Gobioidei. We also examine the phylogenetic content of stem and loop regions in order to quantify the extent of compensatory mutation within 12S rRNA sequences and to discuss the phylogenetic implications of the degree of constraint on compensatory mutations. In addition, the nucleotide composition and the patterns of nucleotide substitution in stems and loops of gobioid 12S rRNA genes are evaluated based on the secondary structure model we propose.


    Materials and Methods
 
Sequences included in this study and their accession numbers are listed in table 1. DNA sequences were first aligned using the default parameters of CLUSTALW (Thompson, Higgins, and Gibson 1994). Sequence alignments were then improved by recognizing stem regions; we started with the C. carpio model (Peer et al. 1994) and the mammalian 12S rRNA model (Springer and Douzery 1996) as references. To determine the secondary structure, we followed the method of Springer and Douzery (1996), using two steps to elucidate the boundaries of stems and loops. First, we used the criterion that a potential base pairing must occur in at least 75% of the sequences examined. To find the potential base pairing, we allowed Watson-Crick base pairs, as well as thymine-guanine (T:G)-type interactions. In some cases, the boundaries of stems and loops were hard to detect visually. We used the FoldRNA algorithm (Zuker 1989) in the University of Wisconsin Genetics Computer Group (GCG) Package, version 9.0 (Genetics Computer Group 1997) to calculate the minimal free energies for folding some potential helical structures. FoldRNA finds a secondary structure of minimum free energy for an RNA molecule based on the published values of stacking and loop destabilizing energies (Zuker 1989). Second, we searched for compensatory substitutions as evidence to validate (or invalidate) these putative stems. Compensatory mutations often occur in the form of a positional covariance, i.e., changes at one position that covary with changes at a complementary position so as to maintain proper base pairing. Where stems were invariantly paired among fishes, so that no compensatory evidence could be found, we compared our model with the mammalian model for such evidence, because 12S rRNA molecules have been sequenced in almost all major lineages among mammals. Stems were delimited in our proposed model by bilateral bulges of two or more bases. Unilateral bulges were allowed in the context of a single stem. A base pair was named by its position in the stem: for example, 14-4 indicates the fourth base pair of stem 14.
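
The two-step screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually run for this study: the function names and the toy alignment are hypothetical, sequences are assumed to be gap-free strings in DNA letters, and covariation is taken to mean two observed pairing states that differ at both positions.

```python
# Pairs treated as potential base pairs: Watson-Crick plus T:G-type pairs.
# Sequences are written in DNA letters, so uracil appears as T.
PAIRING = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("G", "T"), ("T", "G")}


def pairing_fraction(alignment, i, j):
    """Fraction of aligned sequences whose bases in columns i and j can pair."""
    paired = sum((seq[i], seq[j]) in PAIRING for seq in alignment)
    return paired / len(alignment)


def is_candidate_pair(alignment, i, j, threshold=0.75):
    """Step 1: pairing must occur in at least `threshold` of the sequences."""
    return pairing_fraction(alignment, i, j) >= threshold


def has_covariation(alignment, i, j):
    """Step 2 (assumed definition): two pairing states differing at BOTH positions."""
    states = {(seq[i], seq[j]) for seq in alignment if (seq[i], seq[j]) in PAIRING}
    return any(a[0] != b[0] and a[1] != b[1] for a in states for b in states)


# Hypothetical toy alignment (not data from this study); columns 0 and 4 pair.
toy = ["GAAAC", "GAAAC", "AAAAT", "GAAAA"]
print(pairing_fraction(toy, 0, 4))    # three of four sequences can pair
print(is_candidate_pair(toy, 0, 4))   # meets the 75% criterion
print(has_covariation(toy, 0, 4))     # G:C and A:T both occur
```

In the toy alignment the G:C to A:T change is the kind of positional covariance that would count as compensatory evidence for the putative stem.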

Phylogenetic analyses using the neighbor-joining (NJ), maximum parsimony (MP), and maximum-likelihood (ML) methods were carried out using PAUP (Swofford 1999). On the basis of morphological studies, Hoese and Gill (1993) treated Odontobutis as the sister group of the rest of the Gobioidei. We also added Scomber australasicus (Perciformes) as an outgroup for the analysis.

To study the impact of base covariance in stems on the phylogenetic analysis, we applied different loop-stem weightings of 1:1 (assumption of nucleotide independence), 1:0.8 (28S rRNA by Dixon and Hillis 1993), 1:0.6 (12S rRNA by Springer and Douzery 1996, and this study; see Results), 1:0.5 (strict dependence of nucleotides because of exact compensatory changes), 1:0 (loop only), and 0:1 (stem only) when the MP method was used. Only ratios of 1:1, 0:1, and 1:0 were used for the NJ and ML methods because of the difficulty of incorporating weighting per se into model-based methods (Tillier and Collins 1995). For each weight used, 500 bootstrap replications were performed for the MP and NJ analyses. We did not carry out bootstrapping for the ML method because of time considerations. We also used the standard-error test of Rzhetsky and Nei (1992) to estimate confidence probabilities for branches on the NJ tree using MEGA (Kumar, Tamura, and Nei 1993).
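
To make the weighting schemes concrete, the following minimal sketch shows how a loop:stem ratio translates into per-site weights and a weighted parsimony length. The function names, the stem mask, and the per-site change counts are hypothetical illustrations; in the study itself the weights were applied within PAUP, whose commands are not reproduced here.

```python
# Loop:stem weighting schemes listed in the text, as (loop_weight, stem_weight).
SCHEMES = {
    "1:1": (1.0, 1.0),
    "1:0.8": (1.0, 0.8),
    "1:0.6": (1.0, 0.6),
    "1:0.5": (1.0, 0.5),
    "1:0": (1.0, 0.0),   # loop only
    "0:1": (0.0, 1.0),   # stem only
}


def site_weights(is_stem, scheme):
    """Return one weight per alignment site from a boolean stem mask."""
    loop_w, stem_w = SCHEMES[scheme]
    return [stem_w if stem else loop_w for stem in is_stem]


def weighted_length(changes_per_site, weights):
    """Weighted parsimony length: sum over sites of (changes x site weight)."""
    return sum(c * w for c, w in zip(changes_per_site, weights))


# Hypothetical four-site example: sites 2 and 3 are stem positions.
is_stem = [False, True, True, False]
changes = [2, 1, 1, 0]
for scheme in SCHEMES:
    print(scheme, weighted_length(changes, site_weights(is_stem, scheme)))
```

Under the 1:0.5 scheme the two paired stem changes together count as much as a single loop change, which is the sense in which that ratio represents strict dependence.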

Base compositions and nucleotide substitutions in stems and loops were determined using MEGA. The numbers of different types of substitutions were computed with MacClade (Maddison and Maddison 1992) with reference to the established phylogeny. Substitutions in stem regions were divided into four types according to Springer, Hollar, and Burk (1995), which was in turn modified from Dixon and Hillis (1993): Type I (complementary to complementary), Type II (complementary to noncomplementary), Type III (noncomplementary to complementary), and Type IV (noncomplementary to noncomplementary). Each of these is further divided into single and double substitutions, depending on whether one or two substitutions occurred on a particular branch of the tree. The number of substitutions falling into each category listed above was tabulated by MacClade (Maddison and Maddison 1992). In some cases, the mapped pathways of character evolution were ambiguous. In these instances, each of the alternative pathways was given equal weight in calculating the mean values, as recommended by Maddison (1994).
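
The four substitution types can be illustrated with a short classifier. This is a sketch only: the function name is hypothetical, and treating G:T pairs as complementary is an assumption carried over from the pairing rules used for model building, which may differ from the exact convention of Springer, Hollar, and Burk (1995).

```python
# Assumption: Watson-Crick and G:T-type pairs both count as complementary.
PAIRING = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("G", "T"), ("T", "G")}

TYPES = {
    (True, True): "I",      # complementary -> complementary
    (True, False): "II",    # complementary -> noncomplementary
    (False, True): "III",   # noncomplementary -> complementary
    (False, False): "IV",   # noncomplementary -> noncomplementary
}


def classify_stem_change(before, after):
    """Classify a change at one stem pair along a branch.

    `before` and `after` are (5' base, 3' base) tuples. Returns None when
    nothing changed, otherwise (type, "single" or "double").
    """
    n_changed = sum(x != y for x, y in zip(before, after))
    if n_changed == 0:
        return None
    kind = TYPES[(before in PAIRING, after in PAIRING)]
    return kind, "single" if n_changed == 1 else "double"


print(classify_stem_change(("G", "C"), ("A", "T")))  # both bases changed, pairing kept
print(classify_stem_change(("G", "C"), ("G", "A")))  # one base changed, pairing lost
```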


    Results
 
Secondary Structure
Our core secondary structure model, illustrated in figure 1, is drawn using the Gobiomorphus australis sequence. In this model, 38 of the 43 recognizable stems are supported by at least some compensatory substitutions within fishes. Additionally, compensatory evidence can be found for stems 2, 26, and 40 from other vertebrates. Thus, only stems 7 and 41 in the model we propose are not supported by compensatory changes within vertebrate 12S rRNA sequences. Our model is closer to that of mammals (Springer and Douzery 1996; Peer et al. 2000) than to that of C. carpio (Peer et al. 1994). Further differences between our model and the two others mentioned earlier are explained in the following paragraphs.



Fig. 1. Proposed secondary structure model for fish 12S rRNA illustrated using G. australis. Numbers refer to stems. Hypervariable regions, which are indicated by curved lines, were removed from the phylogenetic analysis.

 
Stem 1 is a 3-bp helix in the model proposed by Peer et al. (1994), but a 5-bp helix is recognized in the models proposed by Springer and Douzery (1996) and Peer et al. (2000). We note that a potential base pair occurs at the second base downstream of 1-3 in most taxa surveyed, with a compensatory substitution observed in Latimeria and Squalus (see Supplementary Material). However, the mutation occurring in the Cyprinidae was not compensatory, and the potential base pair at the first base downstream of 1-3 was found in only 60% of the sequences surveyed. Thus, we retained three base pairs in this stem, but noted that the potential 5-bp helix may exist in some fish taxa, such as Scomber, Squalus, Mustelus, Gadus, and Latimeria (see Supplementary Material).

Stem 4 is 8 bp long in the models of Peer et al. (1994, 2000) and Springer and Douzery (1996). The potential base pairings at positions 4-5 and 4-6 of their models (the first and second bases downstream of 4-4 in our model) were not found in most of the fish taxa surveyed. We therefore split this stem into stems 4 and 5, of four and two base pairs in length, respectively.

The region between 10 and 10' is highly variable; Peer et al. (1994, 2000) recognized only one stem there, whereas Springer and Douzery (1996) proposed two stems. Following the model of Springer and Douzery (1996), we recognized stems 10 and 11, of three and five base pairs in length, respectively.

Although stem 11 is 6 bp long according to Gutell (1994) and Springer and Douzery (1996), the frequency of potential base pairing at 11-6 is low in mammals (only 72% of the suborders and orders surveyed). Among fishes, pairing occurs in 72% of gobioid sequences and in 50% of the sequences of the other fish groups examined. Because compensatory and noncompensatory mutations coexist within the same family (Margariscus and Cyprinus in Cyprinidae; Oncorhynchus and Salmo in Salmonidae; data not shown), we omitted this base pair of stem 11 from our core model.

The Cyprinus model constructed by Peer et al. (1994) indicated that stem 12 is located between 10' and 7' and is two base pairs in length. This region was recognized as 2 bp long in Homo (Gutell 1994) but only 1 bp long in the mammalian model of Springer and Douzery (1996) because of the co-occurrence of both compensatory and noncompensatory mutations. Sequences of this stem are almost identical across all taxa examined. The only mutation, which is compensatory, was observed in Latimeria. We, therefore, tentatively retained this 2-bp stem.

As in Peer et al. (1994), stem 19 in our model is three base pairs in length, compared with two in the models of Springer and Douzery (1996) and Peer et al. (2000). The additional base pair (19-3) shows potential base pairing in 86% of sequences within the Gobioidei and in 100% of the other fish taxa surveyed, with a positional covariance from guanine-cytosine (G:C) to adenine-thymine (A:T) found in Gadus and gobioids. We also noticed that this potential base pair is maintained within carnivores (Ledje and Arnason 1996).

As in Peer et al. (1994), stem 20 in our proposed model is a 6-bp helix, as opposed to a 7-bp helix in the models of Springer and Douzery (1996) and Peer et al. (2000). The pairing at 20-7 (the first base downstream of the stem in our model) occurs in 81% of the gobioid sequences investigated, with predominantly noncompensatory mutations. In addition, because the position is not paired in the sequences of most fishes other than gobioids, we removed it from our model. It is noteworthy that Gutell (1994) proposed an alternative base pairing for this base with stem 23. This is discussed later.

In the mammalian model (Springer and Douzery 1996), stem 21 is 5 bp in length. This region is highly variable and cannot be aligned reliably. We recognize only a 3-bp stem because the first base downstream was not paired in most of the gobioid sequences. In addition, we failed to find this stem in Latimeria.

Stem 23 in our proposed model is 2 bp in length, as in the models of Peer et al. (1994) and Springer and Douzery (1996), compared with the 3-bp stem in Gutell (1994). The extra base pair (23-3) in the latter conflicts with 20-7 in the model of Springer and Douzery (1996), with positional covariance supporting both 20-7 and 23-3. This conflict does not arise in fish sequences, which pair at neither 20-7 nor 23-3. Consequently, we did not include this base pair in our model.

There are four base pairs in stem 27 as proposed by Peer et al. (1994), but like Peer et al. (2000) we recognized only three. The potential base pairing at 27-4 (the first base downstream of the stem in our model) occurs in only 50% of the sequences analyzed, and was not recognized in the model proposed by Springer and Douzery (1996). Consistent compensatory changes, however, have been found at 27-2, from G:C in fishes to A:T in mammals.

Stem 28 is 4 bp long in our model, as it is in the model of Peer et al. (1994). Evidence for a potential base pair at position 28-1 is missing in mammals (Springer and Douzery 1996). In fishes, the base pairing occurs across all taxa surveyed, and compensatory substitutions are found in Cyprinus, Crossostoma, and Gadus (see Supplementary Material).

As in the model of Springer and Douzery (1996), stem 34 in our model is 16 bp in length, which is 2 bp longer than in the models of Peer et al. (1994, 2000) and Hickson et al. (1996). The additional two base pairs, 34-15 and 34-16, are almost identical across the fish sequences surveyed, but compensatory base substitutions at 34-15 are found in some invertebrates (fig. 2 of Hickson et al. 1996).



Fig. 2. Single MP tree based on 12S rRNA sequences. The NJ method produced a tree with identical topology. Numbers above the branches indicate confidence probabilities based on the standard-error test (Rzhetsky and Nei 1992) of the NJ method and the bootstrap percentages (500 replications) of the MP method, respectively.

 
Different ways of pairing between regions 35 and 35' have been proposed (Peer et al. 1994, 2000; Hickson et al. 1996; Springer and Douzery 1996). Hickson et al. (1996) recognized two stems and Peer et al. (1994) proposed three stems in this region, but the evidence for compensatory mutation in their models is limited. Springer and Douzery (1996) proposed a two-stem model for this region with 2- and 5-bp-long stems 35 and 36, respectively. Their model is in good agreement with the fish sequences, and evidence for base covariance is found in the Gobioidei.

Stem 39 is a 6-bp (Ledje and Arnason 1996) or an 8-bp (Springer and Douzery 1996) helix in mammals. However, we recognize only a 3-bp helix in our proposed model. The region between 39 and 39' is highly variable and difficult to align across diverse fish taxa. An additional three potential base pairs, as proposed by Hickson et al. (1996), were observed in some of the fish taxa examined, but compensatory and noncompensatory mutations coexisted, without any strong evidence of positional covariance. We therefore tentatively propose a 3-bp helix for stem 39.

Phylogenetic Analysis
After the sequences were aligned, the regions that could not be aligned reliably were deleted, as indicated by solid black lines in figure 1. In the 12S rRNAs analyzed, there are 443 bp in loops, 455 bp in stems, and 21 bp in bulges, of which 159 bp in loops, 91 bp in stems, and 4 bp in bulges are phylogenetically informative sites. For further analysis, we count bulges as loops rather than as stems, making a total of 464 bp in the loop set.
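
The site counts above can be checked with a few lines of arithmetic. The sketch below simply recombines the numbers given in the text (bulges pooled with loops) and derives the proportion of informative sites in each partition; the proportions are computed here for illustration and are not values reported by the authors.

```python
# Site counts as given in the text.
total_sites = {"loops": 443, "stems": 455, "bulges": 21}
informative_sites = {"loops": 159, "stems": 91, "bulges": 4}

# Bulges are counted with loops for all further analyses.
loop_set_total = total_sites["loops"] + total_sites["bulges"]
loop_set_informative = informative_sites["loops"] + informative_sites["bulges"]

loop_fraction = loop_set_informative / loop_set_total
stem_fraction = informative_sites["stems"] / total_sites["stems"]

print(loop_set_total, loop_set_informative)   # 464 163
print(round(loop_fraction, 3))                # 0.351
print(round(stem_fraction, 3))                # 0.2
```

On these counts, roughly 35% of loop-set sites but only 20% of stem sites are informative, although the two partitions are of nearly equal length.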

A single minimum-length tree derived from the MP method, showing the relationships among the Gobioidei, is given in figure 2. The neighbor-joining method based on Kimura's two-parameter distance (Kimura 1980) yields a tree with the same topology. All the nodes in figure 2 are supported at higher than 50% by both the bootstrap replications and the standard-error test, and all but three nodes are supported at the 90% level by at least one of these methods. This result is in good agreement with both traditional (Hoese and Gill 1993; Pezold 1993, and references therein) and molecular phylogenetic interpretations (Wang et al. 2001).

When stems and loops were analyzed separately, the relationships differed dramatically from those shown in figure 2. A single MP tree based on stems resolved only nodes C, E, and F, with low bootstrap values. The minimum-length tree of loop regions generated by the MP method, in turn, resolved three additional nodes, A, D, and G, with the collapse of B and C (table 2). The relationships within each supported node are identical to those in figure 2. The results of different stem-loop weighting schemes generated by the MP method are also given in table 2, in which increased stem weighting results in a tree topology resembling that in figure 2. Although the results from the three tree-construction methods differed from each other, the general trend for the loop regions to produce a better-resolved topology is the same (table 2). In addition, although the loop regions resolved more nodes than did stems, stem regions performed better in resolving more diverged groups, e.g., node C (table 2).


Table 2 Summary of Different Methods, (A) MP, (B) NJ, and (C) ML, Supporting Major Nodes

 
Sequence Compositions and Nucleotide Substitutions
Nucleotide compositions for the fish 12S rRNA gene and its mammalian counterpart are given in table 3. The nucleotide frequencies are quite uniform across the broad range of taxa examined. In addition, the base composition of 12S rRNA was slightly biased, having average nucleotide frequencies of 30.3%, 21.2%, 25.6%, and 22.9%, respectively, for A, T, C, and G. In loop regions, the highest-frequency nucleotide is A (40.9%), with the rest in the decreasing order of C (23%), T (21%), and G (15.1%). The strand-asymmetric base composition, with a trend of A% > T% and C% > G%, is clearly observed. The base composition in stem regions is much more even than that of loop regions: e.g., the proportion of A (20.2%) is close to that of T (21.9%), and the proportion of C (27.9%) is close to that of G (29.9%). Compared with the mammalian 12S rRNA gene, lower A% and higher G% are consistently observed in both stem and loop regions (table 3).
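
Partitioned base compositions of this kind can be computed from a sequence and a stem/loop mask. The following is a minimal sketch with a hypothetical function name and a made-up toy sequence; the values in table 3 were obtained with MEGA, not with this code.

```python
from collections import Counter


def base_composition(seq, is_stem, want_stem):
    """Percentage of A, T, C, G among stem sites (want_stem=True) or loop sites."""
    bases = [b for b, stem in zip(seq.upper(), is_stem)
             if stem == want_stem and b in "ACGT"]
    counts = Counter(bases)
    total = len(bases)
    if total == 0:
        return {b: 0.0 for b in "ATCG"}
    return {b: 100.0 * counts[b] / total for b in "ATCG"}


# Hypothetical toy hairpin: two paired sites at each end, six loop sites between.
seq = "GCAAACTAGC"
is_stem = [True, True, False, False, False, False, False, False, True, True]

stem = base_composition(seq, is_stem, want_stem=True)
loop = base_composition(seq, is_stem, want_stem=False)
print(stem)
print(loop)
print(loop["A"] > loop["T"], loop["C"] > loop["G"])  # the A > T, C > G pattern
```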


Table 3 Nucleotide Composition of the 12S rRNA Gene

 
The relative rates of nucleotide substitution in loops, stems, and their combination are given in table 4. The overall substitution rate in loops is two times higher than that in stems. When transitions and transversions are considered separately, the transition rate is 1.6 times higher in loops than in stems, whereas the transversion rate is sixfold greater. There is clearly a transition-to-transversion ratio (TS/TV) bias in both stems and loops, the ratio being 5.57 in stems and 1.37 in loops. Despite this apparent TS/TV bias, not all transitions are equally frequent. A–G transitions occur at comparable rates in stems and loops, but C–T transitions are twofold more frequent in loops than in stems. Thus, C–T substitutions are primarily responsible for the TS/TV bias in loops.
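
For readers unfamiliar with the TS/TV ratio, the sketch below shows how substitutions are sorted into transitions (purine to purine or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse) and how the ratio is formed. The function names and the list of changes are hypothetical; the counts in table 4 were obtained with MacClade.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(x, y):
    """Return 'transition', 'transversion', or None if the bases are identical."""
    if x == y:
        return None
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(changes):
    """TS/TV ratio for a list of (from_base, to_base) changes."""
    classes = [substitution_class(x, y) for x, y in changes]
    ts = classes.count("transition")
    tv = classes.count("transversion")
    return ts / tv if tv else float("inf")


# Hypothetical changes: three transitions (A-G, C-T, T-C) and one transversion.
changes = [("A", "G"), ("C", "T"), ("T", "C"), ("A", "C")]
print(ts_tv_ratio(changes))  # 3.0
```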


Table 4 Number of Different Types of Substitutions

 
The numbers of substitutions that fall into the designated classes (Types I–IV) in each stem are given in table 5. The expected frequencies for the different classes, assuming complete independence, were calculated following Dixon and Hillis (1993). The numbers of substitutions shown in table 5 vary considerably among stems: e.g., the 4-bp stem 38 has 22 (9 × 2 + 2 + 1.75 + 0.25) nucleotide substitutions, whereas the 12-bp stem 29 has none. Substitutions that maintain or restore pairing, i.e., both single- and double-base substitutions of Type I and single-base substitutions of Type III, were more frequent than expected by chance. Substitutions that destroy pairing or maintain nonpairing, i.e., both single- and double-base substitutions of Type II and single-base substitutions of Type IV, were significantly fewer than expected. The numbers of double-base substitutions of Types III and IV are 1.17 and 0.83, respectively, showing no significant difference between the observed and expected numbers. The observed number of substitutions that maintain or restore base pairing is 231.7, which corresponds to 68% on a scale of 0%–100%, where 0% represents complete independence and 100% represents complete dependence. The weighting for stem regions following the method of Dixon and Hillis (1993) is estimated to be 0.66.
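
The step from percentage dependence to stem weight can be illustrated as follows. The linear form used here is an assumption inferred from the two endpoints given in Materials and Methods (a weight of 1 under complete independence and 0.5 under strict dependence); it reproduces the paired percentages and weights quoted in this paper, but Dixon and Hillis (1993) should be consulted for the original derivation. The function name is hypothetical.

```python
def stem_weight(dependence):
    """Stem weight for a fractional dependence in [0, 1].

    Assumed linear interpolation: 0 (independent) -> 1.0, 1 (dependent) -> 0.5.
    """
    if not 0.0 <= dependence <= 1.0:
        raise ValueError("dependence must be a fraction between 0 and 1")
    return 1.0 - 0.5 * dependence


# Dependence values quoted in the text for different data sets.
for label, d in [("12S, present study", 0.68),
                 ("12S, mammals", 0.78),
                 ("12S, Serrasalminae", 0.74),
                 ("28S", 0.38)]:
    print(label, round(stem_weight(d), 2))
```

With 68% dependence this gives 0.66, the weight reported above; the other three values round to 0.61, 0.63, and 0.81, matching the weights cited for mammals, Serrasalminae, and 28S rRNA (approximately 0.8).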


Table 5 Number of Substitutions per Stem

 

    Discussion
 
Comparative sequence analysis, using potential base pairing and positional covariance, has proven to be one of the most useful tools for elucidating secondary structures. The comparative analysis in this study suggests some changes to the model of Peer et al. (1994) that bring it closer to the mammalian model. Reliable secondary structure models provide an essential foundation for sequence alignment, which is crucial for phylogenetic investigations (Springer and Douzery 1996; Morrison and Ellis 1997; Mugridge et al. 2000). It has also been shown that nonhomologous alignment is the major cause of inadequate phylogenies (Morrison and Ellis 1997). Because the 12S rRNA gene has been widely used for phylogenetic studies, it is particularly important to apply a robust secondary structure in molecular systematics. Therefore, the secondary structure presented here may be useful for further phylogenetic analyses.

Gobioidei Phylogeny
Following the approach of Dixon and Hillis (1993), the stems of 12S rRNA should be assigned a weight of 0.66, which is very close to the values obtained for mammals (0.61; Springer, Hollar, and Burk 1995) and Serrasalminae (fishes; 0.63; Ortí et al. 1996). It is worth noting that the weighting for mammals was derived by comparing distantly related taxa covering the entire class, whereas that for Serrasalminae was based on closely related groups, such as intergeneric comparisons. Because the weightings were transformed from the dependence of covariation (Dixon and Hillis 1993), the similar weightings obtained from such different data sets may reflect functional constraints on maintaining stem pairing that are similar for the same gene across widely ranging taxa.

Wang et al. (2001) used 12S rRNA to study the phylogeny of the Gobioidei. In that study, they did not take the different constraints on stems and loops into consideration. As shown in table 2, it is clear that loops contain more phylogenetic information than do stems and provide most of the resolution in the 12S rRNA gene, especially when recently diverged groups are considered. The stem regions alone, in turn, can resolve only more distantly related groups. Their small number of informative sites is clearly insufficient to demonstrate relationships among most taxa. Nevertheless, loop regions failed to resolve nodes B and C. A combination of the two data sets, shown in figure 2, produces a better resolution of relationships.

We also assigned different weights, from 0.5 to 0.8, to the stems. It is clear that differences in stem weighting can be important in phylogenetic inference (table 2A). When the stems were assigned a weight of 0.5, the monophyly of node B, Eleotrinae, was not resolved and the bootstrap percentage supporting each node was lower. Because the monophyly of Eleotrinae has been supported by osteological study (Hoese and Gill 1993), this result is less preferred. As the stem weighting was increased to 0.66 and then to 0.8, node B was resolved, and the bootstrap percentage supporting each node also increased. In addition, there was no obvious difference between the results of the analysis with all characters equally weighted and the results of the analysis with stem characters assigned a weight of 0.8. Thus, an increase in stem weighting produced a result closer to the current phylogenetic interpretations.

Phylogenetic Implications
In contrast to our results, Smith (1989) and Morrison and Ellis (1997) concluded that stem regions of the 18S rRNA gene could produce more reliable trees than loop regions. In addition, using the 28S rRNA, Mugridge et al. (2000) reported that pairing and nonpairing regions are equally informative. Three reasons, which are not mutually exclusive, could explain these discrepancies. First, 12S rRNA (<1 kb) contains fewer nucleotides and, as expected, fewer informative characters. The 18S and 28S rRNAs, in turn, contain more nucleotides, which may reveal more phylogenetic information. Second, the difference in length between rRNAs may not only contribute to the amount of information they accumulate but also influence their covariation dependency. The percentage of dependence for 28S rRNA (38%, Dixon and Hillis 1993) is much lower than that for 12S rRNA (78%, Springer, Hollar, and Burk 1995; 74%, Ortí et al. 1996; 68%, present study). The finding of Stephan and Kirby (1993) that the number of covariations decreases with physical distance in RNA secondary structures can explain this observation. Because 28S rRNA is three times longer than 12S rRNA, the latter is expected to have a higher proportion of covariation, which implies that most stem mutations in 28S rRNA genes lack compensatory substitutions. In other words, the stem regions of 28S rRNA may evolve more freely and independently and, in turn, accumulate more phylogenetic information than those of 12S rRNA. Therefore, the finding that the stems and loops of 28S rRNA are equally informative is not surprising. Third, as shown in table 2, the degree of divergence among the taxa examined may be influential. Stems gave better results when distantly related taxa were analyzed. In addition, in a study of interordinal relationships within mammals, Douzery and Catzeflis (1995) showed that the stem as well as the loop regions of 12S rRNA could resolve most of the phylogenetic nodes, although the bootstrap values of some nodes are lower than those from stem sets. Thus, 12S rRNA stem regions may be regarded as more suitable for distantly related organisms than for recently diverged species.

Sequence Compositions and Nucleotide Substitutions
There is a clear violation of strand symmetry (Wu and Maeda 1987) in the loop regions: the intrastrand equalities of A = T and G = C expected at equilibrium are not obeyed (table 3). An excess of adenine is also found in other metazoans. Vawter and Brown (1993) found higher percentages of A in unpaired than in paired regions in other vertebrates as well as in invertebrates. This is presumably because A has the least polarity of the four bases, which favors hydrophobic interactions with proteins (Gutell et al. 1985). This hypothesis, however, cannot explain why C is more frequent than G (1.5 times). On the other hand, skewed nucleotide composition (A% > T%, C% > G% or A% + C% > T% + G%) was also found at the fourfold degenerate positions of mammalian (Reyes et al. 1998) and in the intergenic regions of metazoan (Jermiin, Graur, and Crozier 1995) mitochondrial genomes. Because the fourfold degenerate sites and intergenic regions of the mtDNA genome are considered to be selection-free or selection-limited positions (Asakawa et al. 1991; Perna and Kocher 1995), the violation of strand symmetry in both mitochondrial 12S rRNA loop regions and intergenic regions may reflect an underlying mutational bias rather than functional constraints.

Using 25 complete mammalian mitochondrial genomes to study the nucleotide composition of all three codon positions and of the fourfold degenerate sites of H-strand genes, Reyes et al. (1998) found a strong bias toward A and C in the H-strand protein-coding genes and in the base composition of fourfold degenerate sites. They also found that the number of variable sites in each gene is significantly correlated with the duration of the single-stranded state of H-strand genes during replication. Their main conclusion was that the spontaneous deamination of C and A in the H-strand may be one of the crucial processes for the origin of the asymmetric and biased base composition of mammalian mitochondrial genomes. The same condition is also applicable to the loop regions of the 12S rRNA gene. Less selective constraint allows the directional mutation pressure of the mtDNA genome to contribute to the biased nucleotide composition in the loop regions of 12S rRNA. In stem regions, in turn, because of the constraint of base pairing (table 5), the change of A to T on one side and vice versa on the other side must occur simultaneously. The equilibrium frequency of A should be equal to that of T, and likewise, those of C and G should be equal (table 2). Consequently, the strand symmetry in stem regions is the result of the selective pressure of base pairing.

This hypothesis can also explain the higher rate of C–T transitions than A–G transitions in the loop regions. Although the proportion of A% + G% (53.5%) in loop regions is higher than that of C% + T% (46.5%), transitions between C and T are much more frequent than those between A and G. The same substitution patterns were also observed in the mammalian counterpart (Douzery and Catzeflis 1995; Springer and Douzery 1996). The different transition rates between A–G and C–T may reflect different deamination rates between C and A. The deamination rate of C estimated by Reyes et al. (1998) and Tanaka and Ozawa (1994) is twice as high as that of A. Because of the higher mutation rate from T to C, a greater abundance of C–T transitions than A–G transitions is expected.
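The comparison of C–T and A–G transitions amounts to sorting observed differences into three classes. The sketch below counts uncorrected pairwise differences between two aligned sequences; it is an illustration only, does not estimate substitution rates, and is not the procedure used in this study. The function names and the toy sequences in the usage example are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a: str, b: str) -> str:
    """Classify a pair of aligned bases (both must be A, C, G, or T)."""
    if a == b:
        return "identical"
    pair = {a, b}
    if pair == PURINES:
        return "A-G transition"
    if pair == PYRIMIDINES:
        return "C-T transition"
    return "transversion"


def tally(seq1: str, seq2: str) -> dict:
    """Count difference classes between two aligned sequences.

    Columns containing gaps or ambiguity codes are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {
        "identical": 0,
        "A-G transition": 0,
        "C-T transition": 0,
        "transversion": 0,
    }
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in "ACGT" and b in "ACGT":
            counts[classify(a, b)] += 1
    return counts


# Toy aligned pair, invented for illustration.
example = tally("ACGTAC-A", "ATGCGAAA")
print(example)
```

Applied to the loop columns of a real alignment, a larger count of C–T than A–G transitions would correspond to the pattern described above.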

The underlying selective pressure also contributes to the different degrees of transition-transversion bias in stem and loop regions. In stem regions, base pairing is retained under an A-to-G transition because both A:T and G:T pairings are allowed in the rRNA secondary structure. The same holds for the C-to-T transition, because the change of G:C to G:T also preserves pairing. In contrast, any transversion destroys base pairing and is therefore presumably not selectively favored. In loop regions, where base pairing is not required, the transition-transversion bias may only reflect the biased mutational pressure of the mtDNA genome.
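The pairing argument can be verified by enumeration. The sketch below takes the three pairings named in the text (A:T, G:C, and G:T, written in DNA notation), applies every possible single-base change to one side of each pair, and records whether the result is still one of the allowed pairings. The variable and function names are illustrative.

```python
# Pairings treated as allowed in the stem model (DNA notation).
PAIRED = {frozenset("AT"), frozenset("GC"), frozenset("GT")}

# The transition partner of each base.
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}


def is_paired(x: str, y: str) -> bool:
    """True if the two bases form an allowed pairing."""
    return frozenset((x, y)) in PAIRED


def survey():
    """Apply every single-base change to one side of each allowed pair.

    Returns (kept, total): counts of changes that preserve pairing and
    counts of all changes, each split into transitions and transversions.
    """
    kept = {"transition": 0, "transversion": 0}
    total = {"transition": 0, "transversion": 0}
    for x, y in (("A", "T"), ("G", "C"), ("G", "T")):
        # Mutate the first base, then the second, keeping the other fixed.
        for old, fixed in ((x, y), (y, x)):
            for new in "ACGT":
                if new == old:
                    continue
                kind = "transition" if TRANSITION[old] == new else "transversion"
                total[kind] += 1
                if is_paired(new, fixed):
                    kept[kind] += 1
    return kept, total


kept, total = survey()
print("pairing kept:", kept)
print("all changes: ", total)
```

No transversion on one side of a pair leaves an allowed pairing, because every allowed pair joins a purine with a pyrimidine. Transitions preserve pairing in four of the six cases; the exceptions are T-to-C in an A:T pair and G-to-A in a G:C pair, both of which give an A:C mismatch.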

Compared with their mammalian counterparts, the paired regions of all fish taxa have lower proportions of A + T and higher proportions of G + C. The G + C frequency in paired regions of mammalian 12S rRNA is only 50.4%, which is 10% lower than that of fishes in general. The higher G + C composition in paired regions of rRNAs has been predicted from free energy considerations, because G:C pairs have a lower free energy value than G:T or A:T pairs (Turner, Sugimoto, and Freier 1988). Martin (1995) found that acceleration in the rate of oxygen consumption is associated with an increase in A + T nucleotides. The lower G + C composition in mammalian 12S rRNA may be caused by higher oxygen metabolism in mammals than in fishes.

Comparisons of substitutions that maintain or destroy base pairings demonstrate the strength of selection pressure for compensatory substitutions along the evolutionary tree. Some positions within stems, however, are characterized by mispairings according to our secondary structure model. Two reasons could explain this. First, some of the sites under study could be of limited functional importance, allowing mispairings to persist. In stem 11, the terminal loop between 11 and 11' is not conserved, ranging from 3 to 11 bases in length, and two mispairings occur at position 11-5 (Gadus and Latimeria). This could be caused by the loss of functional constraints in this region. Second, substitutions on one side of the stem are probably not simultaneously compensated for on the other side. This could lead to a time lag in compensation of the mispairing, as demonstrated by Kraus et al. (1992). In stem 9 of Cyprinus, position 9-3 is not paired, but the compensatory substitutions are otherwise observed in Squalus and Mustelus. This may reflect such a time lag: a G-to-C transversion has occurred on one side, but the compensatory mutation on the other side has not yet occurred.
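The distinction between compensated and uncompensated stem substitutions can be made explicit for a single paired position. The sketch below compares the state of one stem position in two taxa, where each state is the base on the 5' side and the base on the 3' side. The categories, the function names, and the example states are assumptions made for illustration; they are not the scoring scheme or the data of this study.

```python
# Pairings treated as allowed in the stem model (DNA notation).
PAIRED = {frozenset("AT"), frozenset("GC"), frozenset("GT")}


def is_paired(x: str, y: str) -> bool:
    """True if the two bases form an allowed pairing."""
    return frozenset((x, y)) in PAIRED


def classify_change(state1, state2) -> str:
    """Compare one stem position in two taxa.

    Each state is a (5' base, 3' base) tuple. The second state is
    scored relative to the first.
    """
    changed = sum(a != b for a, b in zip(state1, state2))
    if changed == 0:
        return "unchanged"
    if not is_paired(*state2):
        # One side has changed without a matching change on the other,
        # as in the time-lag explanation for mispairings.
        return "mispairing"
    if changed == 2:
        return "compensatory"
    return "single substitution, pairing kept"


# Hypothetical states, invented to exercise each category.
examples = [
    (("G", "C"), ("A", "T")),  # both sides changed, still paired
    (("G", "C"), ("C", "C")),  # G-to-C on one side only
    (("A", "T"), ("G", "T")),  # transition absorbed by a G:T pairing
    (("G", "C"), ("G", "C")),  # no change
]
for before, after in examples:
    print(before, "->", after, ":", classify_change(before, after))
```

A position scored as a mispairing in one taxon and as compensatory in others would fit the time-lag explanation given above.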


    Supplementary Material
 
The aligned sequences of nine assorted fish species with their secondary structure information are shown in the supplementary material on the SMBE website (www.molbiolevol.org). The alignment file in MSWord format is also available upon request to the authors.


    Acknowledgements
 
The authors express their sincere thanks to Dr. Chung-I Wu, University of Chicago, for his helpful comments. This study was supported by the National Science Council, Taiwan, to S.C.L. (NSC 89-2611-001-001).


    Footnotes
 
Ross Crozier, Reviewing Editor

Keywords: nucleotide composition, deamination, asymmetric replication, transition-transversion bias

Address for correspondence and reprints: Sin-Che Lee, Laboratory of Molecular Systematics of Fishes, Institute of Zoology, Academia Sinica, Taipei 11529, Taiwan. sclee@gate.sinica.edu.tw.


    References
 

    Asakawa S., Y. Kumazawa, T. Araki, H. Himeno, K. Miura, K. Watanabe. 1991. Strand-specific nucleotide composition bias in echinoderm and vertebrate mitochondrial genomes. J. Mol. Evol. 32:511-520.

    Cao Y., P. J. Waddell, N. Okada, M. Hasegawa. 1998. The complete mitochondrial DNA sequence of the shark Mustelus manazo: evaluating rooting contradictions to living bony vertebrates. Mol. Biol. Evol. 15:1637-1646.

    Chang Y. S., F. L. Huang, T. B. Lo. 1994. The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38:138-155.

    Dahlberg A. E. 1989. The functional role of ribosomal RNA in protein synthesis. Cell 57:525-529.

    Dixon M. T., D. M. Hillis. 1993. Ribosomal RNA secondary structure: compensatory mutations and implications for phylogenetic analysis. Mol. Biol. Evol. 10:256-267.

    Douzery E., F. M. Catzeflis. 1995. Molecular evolution of the mitochondrial 12S rRNA in Ungulata (Mammalia). J. Mol. Evol. 41:622-636.

    Gatesy J., G. Amato, E. Vrba, G. Schaler, R. Desalle. 1997. A cladistic analysis of mitochondrial ribosomal DNA from Bovidae. Mol. Phylogenet. Evol. 7:303-319.

    Gutell R. R. 1994. Collection of small subunit (16S- and 16S-like) ribosomal RNA structures. Nucleic Acids Res. 22:3502-3507.

    Gutell R. R., B. Weiser, C. Woese, H. F. Noller. 1985. Comparative anatomy of 16S-like ribosomal RNA. Prog. Nucleic Acid Res. Mol. Biol. 32:155-216.

    Hickson R. E., C. Simon, A. Cooper, G. S. Spicer, J. Sullivan, D. Penny. 1996. Conserved sequence motifs, alignment, and secondary structure for the third domain of animal 12S rRNA. Mol. Biol. Evol. 13:150-169.

    Hoese D. F., A. G. Gill. 1993. Phylogenetic relationships of eleotridid fishes (Perciformes: Gobioidei). Bull. Mar. Sci. 52:415-440.

    Johansen S., I. Bakke. 1996. The complete mitochondrial DNA sequence of Atlantic cod (Gadus morhua): relevance to taxonomic studies among codfishes. Mol. Mar. Biol. Biotechnol. 5:203-214.

    Jermiin L. S., D. Graur, R. H. Crozier. 1995. Evidence from analyses of intergenic regions for strand-specific directional mutation pressure in Metazoan mitochondrial DNA. Mol. Biol. Evol. 12:558-563.

    Kimura M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.

    Kjer K. M. 1995. Use of rRNA secondary structure in phylogenetic studies to identify homologous positions: an example of alignment and data presentation from the frogs. Mol. Phylogenet. Evol. 4:314-330.

    Kraus F. L., M. Jarecki, M. S. Miyamoto, M. Tanhauser, P. J. Laipis. 1992. Mispairing and compensational changes during the evolution of mitochondrial ribosomal RNA. Mol. Biol. Evol. 9:770-774.

    Kumar S., K. Tamura, M. Nei. 1993. MEGA: molecular evolutionary genetics analysis. Version 1.01. Pennsylvania State University.

    Lavergne A., E. Douzery, T. Stichler, F. M. Catzeflis, M. S. Springer. 1996. Interordinal mammalian relationships: evidence for paenungulate monophyly is provided by complete mitochondrial 12S rRNA sequences. Mol. Phylogenet. Evol. 6:245-258.

    Ledje C., U. Arnason. 1996. Phylogenetic relationships within Caniform carnivores based on analysis of the mitochondrial 12S rRNA gene. J. Mol. Evol. 43:641-649.

    Maddison D. R. 1994. Phylogenetic methods for inferring the evolutionary history and processes of change in discretely valued characters. Annu. Rev. Entomol. 39:267-292.

    Maddison W. P., D. R. Maddison. 1992. MacClade: analysis of phylogeny and character evolution. Sinauer Associates, Sunderland, Mass.

    Martin A. P. 1995. Metabolic rate and directional nucleotide substitution in animal mitochondrial DNA. Mol. Biol. Evol. 12:1124-1131.

    Montgelard C., F. M. Catzeflis, E. Douzery. 1997. Phylogenetic relationships of Artiodactyls and Cetaceans as deduced from the comparison of cytochrome b and 12S rRNA mitochondrial sequences. Mol. Biol. Evol. 14:550-559.

    Morrison D. A., J. T. Ellis. 1997. Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa. Mol. Biol. Evol. 14:428-441.

    Mugridge N. B., D. A. Morrison, T. Jakel, A. R. Heckeroth, A. M. Tenter, A. M. Johnson. 2000. Effects of sequence alignment and structural domains of ribosomal DNA on phylogeny reconstruction for the protozoan family Sarcocystidae. Mol. Biol. Evol. 17:1842-1853.

    Noller H. F. 1984. Structure of ribosomal RNA. Annu. Rev. Biochem. 53:119-162.

    Ortí G., P. Petry, J. I. R. Porto, M. Jégu, A. Meyer. 1996. Patterns of nucleotide change in mitochondrial ribosomal RNA genes and the phylogeny of piranhas. J. Mol. Evol. 42:169-182.

    Peer Y. V., I. V. Broeck, P. Rijk, R. Wachter. 1994. Database on the structure of small ribosomal subunit RNA. Nucleic Acids Res. 22:3488-3494.

    Peer Y. V., P. Rijk, J. Wuyts, T. Winkelmans, R. Wachter. 2000. The European small subunit ribosomal RNA database. Nucleic Acids Res. 28:175-176.

    Perna N. T., T. D. Kocher. 1995. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J. Mol. Evol. 41:353-358.

    Pezold F. 1993. Evidence for monophyletic Gobiinae. Copeia 1993:634-643.

    Rasmussen A. S., U. Arnason. 1999. Phylogenetic studies of complete mitochondrial DNA molecules place cartilaginous fishes within the tree of bony fishes. J. Mol. Evol. 48:118-123.

    Reyes A., C. Gissi, G. Pesole, C. Saccone. 1998. Asymmetrical directional mutation pressure in the mitochondrial genome of mammals. Mol. Biol. Evol. 15:957-966.

    Richards C. M., W. S. Moore. 1996. A phylogeny for the African treefrog family Hyperoliidae based on mitochondrial rDNA. Mol. Phylogenet. Evol. 5:522-532.

    Rzhetsky A., M. Nei. 1992. A simple method for estimating and testing minimum-evolution trees. Mol. Biol. Evol. 9:945-967.

    Saitou N., M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

    Simons A. M., R. L. Mayden. 1998. Phylogenetic relationships of the western North American phoxinins (Actinopterygii: Cyprinidae) as inferred from mitochondrial 12S and 16S ribosomal RNA sequences. Mol. Phylogenet. Evol. 9:308-329.

    Smith A. B. 1989. RNA sequence data in phylogenetic reconstruction: testing the limits of its resolution. Cladistics 5:321-344.

    Springer M. S., L. J. Hollar, A. Burk. 1995. Compensatory substitutions and the evolution of the mitochondrial 12S rRNA gene in mammals. Mol. Biol. Evol. 12:1138-1150.

    Springer M. S., E. Douzery. 1996. Secondary structure and patterns of evolution among mammalian mitochondrial 12S rRNA molecules. J. Mol. Evol. 43:357-373.

    Stephan W., D. A. Kirby. 1993. RNA folding in Drosophila shows a distance effect for compensatory fitness interactions. Genetics 135:97-103.

    Swofford D. L. 1999. PAUP: phylogenetic analysis using parsimony. Version 4.04. Sinauer Associates, Sunderland, Mass.

    Tanaka M., T. Ozawa. 1994. Strand asymmetry in human mitochondrial DNA mutations. Genomics 22:327-335.

    Thompson J. D., D. G. Higgins, T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.

    Tillier E. R. M., R. A. Collins. 1995. Neighbor joining and maximum likelihood method with RNA sequences: addressing the interdependence of sites. Mol. Biol. Evol. 12:7-15.

    Turner D. H., N. Sugimoto, S. M. Freier. 1988. RNA structure prediction. Annu. Rev. Biophys. Chem. 17:167-192.

    Tzeng C. S., C. F. Hui, S. C. Shen, P. C. Huang. 1992. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates. Nucleic Acids Res. 20:4853-4858.

    Vawter L., W. M. Brown. 1993. Rates and patterns of base change in the small subunit ribosomal RNA gene. Genetics 134:597-608.

    Wang H. Y., M. P. Tsai, J. Dean, S. C. Lee. 2001. Molecular phylogenetic relationships of gobioid fishes based on analysis of the mitochondrial 12S rRNA sequences. Mol. Phylogenet. Evol. 20:390-408.

    Wheeler W. C., R. L. Honeycutt. 1988. Paired sequence difference in ribosomal RNAs: evolution and phylogenetic implications. Mol. Biol. Evol. 5:90-96.

    Wu C.-I., N. Maeda. 1987. Inequality in mutation rates of the two strands of DNA. Nature 327:169-170.

    Zardoya R., A. Garrido-Pertierra, J. M. Bautista. 1995. The complete nucleotide sequence of the mitochondrial DNA genome of the rainbow trout, Oncorhynchus mykiss. J. Mol. Evol. 41:942-951.

    Zardoya R., A. Meyer. 1997. The complete DNA sequence of the mitochondrial genome of a ‘living fossil,’ the coelacanth (Latimeria chalumnae). Genetics 146:995-1010.

    Zuker M. 1989. Computer prediction of RNA structure. Pp. 262–288 in J. E. Dahlberg and J. N. Abelson, eds. Methods in Enzymology, Vol. 180. Academic Press, San Diego, Calif.

Accepted for publication August 20, 2001.




