A Re-evaluation of 12S Ribosomal RNA Variability in Drosophila pseudoobscura

Mohamed A. F. Noor² and John C. Larkin

Department of Biological Sciences, Louisiana State University


    Abstract
 
Two recent studies have presented conflicting views on the variation present within the 294-base third domain of the 12S rRNA gene in the genus Drosophila, and in D. pseudoobscura in particular. One study suggested that this gene is highly invariant across the genus, while another recovered 22 distinct haplotypes from 22 strains of D. pseudoobscura. First, we sequenced this gene in numerous lines of D. pseudoobscura and its relatives, finding only two haplotypes in the third domain, and we failed to confirm any of the published sequences. Second, we note that the published sequence divergence between strains of D. pseudoobscura was as great as that documented between distantly related Drosophila species. Third, we show that the published polymorphisms of this region within D. pseudoobscura would disrupt the secondary structure of the resulting molecule. We conclude that the published 12S rRNA sequences of D. pseudoobscura do not accurately reflect variability of the functional gene, and that this gene is relatively invariant in D. pseudoobscura and D. persimilis.


    Introduction
 
The mitochondrial 12S ribosomal RNA gene has been sequenced and analyzed extensively in a wide variety of animal taxa. In particular, the slowly evolving third domain of this gene has received the most attention, as it has proven useful in systematic studies of families and genera (Palumbi 1996). However, two recent studies have provided conflicting results concerning the variability of this gene region in the genus Drosophila. Specifically, Jenkins, Basten, and Anderson (1996) studied nucleotide diversity within one Drosophila species, D. pseudoobscura, and documented more than 30 variable sites within 400 bp of sequence, as well as several divergent sites between subspecies and sibling species. In contrast, Simon et al. (1996) suggested that this region evolved very slowly within the genus Drosophila, and they identified very little divergence between very distantly related species. Furthermore, in the course of population genetics studies in D. pseudoobscura, we found much less sequence diversity at this locus than was reported by Jenkins, Basten, and Anderson (1996).

These inconsistencies prompted us to reexamine the diversity of the third domain of the 12S rRNA gene within D. pseudoobscura. First, we present results from our own sequencing efforts of this region in D. pseudoobscura and its sibling species, Drosophila persimilis. Second, we directly compare the variability of this region using our data and the data obtained from the two studies above. Finally, we examine how the apparent polymorphisms of this region within the genus would affect the secondary structure of the resulting molecule based on two published models, and we use this analysis to test the validity of the conflicting views of sequence diversity in this region.


    Materials and Methods
 
We sequenced the third domain of 12S rRNA in 1 line of Drosophila subobscura, 1 line of Drosophila willistoni, 15 lines of Drosophila pseudoobscura pseudoobscura, 4 lines of Drosophila pseudoobscura bogotana, and 5 lines of D. persimilis using primers designed from the sequences in the Jenkins, Basten, and Anderson (1996) study (F: 5'-GTGAATTATTTAAGTTAAAAGA-3'; R: 5'-CCTTTCACATAGATCTTACTG-3'). Five samples (three D. p. bogotana, one D. p. pseudoobscura, and one D. persimilis) were also amplified and sequenced using primers DRMT1653S and DRMT2279N from the Jenkins, Basten, and Anderson (1996) study, but these sequences did not differ from the ones we obtained using the above primers. The D. subobscura individual we sequenced was wild-caught at Mount St. Helena, Calif. The D. willistoni individual sequenced was from the L'Habitatue line provided by J. Gleason. Drosophila pseudoobscura pseudoobscura lines surveyed were from Flagstaff, Ariz. (3), Mount St. Helena, Calif. (2), Mather, Calif. (3), Puebla, Mexico (1), Goldendale, Wash. (3), Cheney, Wash. (2), and Lincoln, New Zealand (1). The D. p. bogotana lines surveyed were lines 35, 68, 71, and ER from the Jenkins, Basten, and Anderson (1996) study, provided by W. Anderson. The D. persimilis lines sequenced were from Mount St. Helena, Calif. (3), and Mather, Calif. (2).

DNA was extracted from single flies using squish preparations (Gloor et al. 1993). PCR was then performed as described previously (Noor, Schug, and Aquadro 2000), with an annealing temperature of 55°C. PCR products were cleaned using Qiagen's PCR purification kit, and sequencing followed the protocol of ABI's BigDye terminator kit. Sequences were run on an ABI 377 at the Museum of Natural Science at Louisiana State University. Sequence analysis was confined to the 294 bases composing the third domain of the 12S rRNA gene. Sequences were aligned using CLUSTAL W. Base numbering in the text is given relative to the Drosophila yakuba sequence, with base 1 taken as the first base of the D. yakuba 12S rRNA third domain (Clary and Wolstenholme 1985).

Differences between aligned sequences were counted manually. We used the following sequences from the Jenkins, Basten, and Anderson (1996) study for this analysis: AH162, ARS, SB33, BC, BOG3, BOG4, BOG71, BOG68, MEX89, PSU246, PSU361, TEX86, BOG69, PSU360, MEX34, TEX78, BOG35, MEX32, and PSU242. We excluded some sequences of Jenkins, Basten, and Anderson (1996) that were recently found to have potential documentation errors (T. Jenkins, personal communication).
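The counts in this study were made by hand. As an illustration only, the same tally over a pair of aligned sequences can be sketched in Python. The function name, the `count_gaps` option, and the toy sequences are assumptions introduced here; the text does not say how alignment gaps were scored when differences were counted.

```python
def count_differences(seq_a, seq_b, count_gaps=True):
    """Count alignment columns at which two aligned sequences differ.

    Both sequences must come from the same alignment (equal length,
    with '-' marking gaps). Whether a gap column counts as a difference
    is an assumption controlled by count_gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == base_b:
            continue  # identical column (including gap vs. gap)
        if not count_gaps and "-" in (base_a, base_b):
            continue  # skip insertion/deletion columns when requested
        differences += 1
    return differences


# Toy aligned fragments (hypothetical, not taken from any study).
reference = "ATTGCA-TTA"
variant = "ATCGCAGTTA"
print(count_differences(reference, variant))                    # substitutions and indels
print(count_differences(reference, variant, count_gaps=False))  # substitutions only
```

Keeping substitutions and indels separable matters here because the Results report some comparisons "not including the indel."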

We took several measures to ensure that these lines and reactions were not contaminated. First, all of the DNA preparations except for those from the D. p. bogotana lines have been used extensively in various microsatellite genotyping efforts (Noor, Schug, and Aquadro 2000), and none of them possessed identical genotypes across all loci studied. Many of these preparations are from wild-caught flies or F1 flies from nature, decreasing the probability that laboratory contamination could have occurred. Second, for each set of PCR and sequencing reactions, we simultaneously analyzed a D. subobscura individual. This individual consistently had a sequence distinct from all of our other samples, suggesting that our supplies and reagents were not contaminated with DNA from D. pseudoobscura. Its sequence differed from the published D. subobscura sequence for this region (accession number AF126307) at only one base. All sequence electropherograms were read and confirmed by two individuals.

The inbred strains of D. p. bogotana that we sequenced were also used in the Jenkins, Basten, and Anderson (1996) study. To rule out the possibility of contamination in these stocks during the intervening years, we genotyped these strains for a variable microsatellite (DPS4001) using the protocol of Noor, Schug, and Aquadro (2000). Among the four homozygous strains, we found three different alleles at this locus, suggesting that there was little or no cross-contamination or contamination from any single line.

For the analysis of the location of mutations relative to secondary-structure elements in the RNA, paired regions in the secondary-structure model of Clary and Wolstenholme (1985) were located on the alignment. Phylogenetically conserved bases identified by Hickson et al. (1996) that are present in D. yakuba were also identified. Base substitutions were considered unfavorable if they resulted in a base pair other than A-U, G-C, or G-U in a base-paired region in the Clary and Wolstenholme (1985) model, or if they changed a phylogenetically conserved base that was present in D. yakuba.
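The rule for scoring a substitution as unfavorable can be written as a short predicate. This is a minimal sketch, not the procedure actually used (the scoring was done against the published structure model). The function names and the way a site is described (`partner`, `conserved`) are hypothetical conveniences.

```python
# Pairs treated as compatible with a stem: Watson-Crick A-U and G-C, plus the G-U wobble.
ALLOWED_PAIRS = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def to_rna(base):
    """Normalize a DNA or RNA base letter to upper-case RNA (T becomes U)."""
    return base.upper().replace("T", "U")


def is_allowed_pair(base_1, base_2):
    """True if the two bases form A-U, G-C, or G-U (in either order)."""
    return frozenset((to_rna(base_1), to_rna(base_2))) in ALLOWED_PAIRS


def is_unfavorable(new_base, partner=None, conserved=False):
    """Apply the rule stated in the text to a single substitution.

    new_base  -- base present after the substitution
    partner   -- base paired with this site in the structure model,
                 or None if the site is unpaired (hypothetical encoding)
    conserved -- True if the site is a phylogenetically conserved base
    """
    if conserved:
        return True  # any change to a conserved base is unfavorable
    if partner is not None and not is_allowed_pair(new_base, partner):
        return True  # the substitution breaks a predicted base pair
    return False


print(is_unfavorable("A", partner="G"))  # A-G is not an allowed pair
print(is_unfavorable("U", partner="G"))  # G-U wobble is tolerated
```

Note that a site's partner may itself have changed in the same sequence, so in practice both positions of a pair would be read from the sequence being scored.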

Thermodynamic stability of 12S rRNA subdomain structures was examined using the efn server at http://www.ibc.wustl.edu/~zuker/rna/, using the mfold 3.0 parameters (Mathews et al. 1999). Sequences were forced to fold according to a particular secondary-structure model, and the program determined the free energy (ΔG) of the resulting structure. A minimum ΔG structure for the domain of D. yakuba 12S rRNA containing helices 38–42 was determined using mfold 3.0.


    Results
 
Sequencing Results
Within the third domain of the 12S rRNA gene, we found no polymorphic sites within D. p. pseudoobscura or D. persimilis, and no difference between these species. One of the five D. p. bogotana strains that we surveyed possessed an additional haplotype, differing by one base from the D. p. pseudoobscura sequence. Surprisingly, none of our sequences were identical to any of the sequences reported by Jenkins, Basten, and Anderson (1996; see below). One additional polymorphic site was noted outside the surveyed third domain of 12S rRNA but included in our GenBank submissions. Also, our sequences from the D. subobscura strain were always identical to each other and differed at only one base from the published sequence of 12S rRNA in this species (accession number AF126307). This single difference was at a nonconserved site (see Secondary-Structure Analysis, below).

Comparison Between Intraspecific and Interspecific Variation
We have also compared the differences observed within D. pseudoobscura in the Jenkins, Basten, and Anderson (1996) study and those observed between species in the Simon et al. (1996) study. We expected that there would be substantially fewer differences between intraspecific lines than between interspecific pairs. This contrast should be most apparent in the interspecific pairs involving D. virilis, which is estimated to be approximately 60 Myr divergent from the other Drosophila species examined (Powell 1997, p. 283).

As reported by Simon et al. (1996), we noted very little differentiation between Drosophila species in the third domain of the 12S rRNA locus; species possessed differences at only 3–17 sites. The only observed insertion/deletion was between the D. willistoni sequence of Simon et al. (1996) and all of the other species, and this indel was not noted in our sequence of this region in a different line. The largest difference between species was between D. melanogaster and D. willistoni, which differed from one another by 17 base pair substitutions, not including the indel. Drosophila virilis and D. melanogaster differed at 11 bp.

Relative to our predominant D. pseudoobscura sequence, the 20 sequences reported by Jenkins, Basten, and Anderson (1996) differed at 1–10 sites (mean = 5.21), including several insertions and deletions. No two sequences from this study were identical. We focused our next comparison on the BC sequence of D. pseudoobscura from Jenkins, Basten, and Anderson (1996) for two reasons: it differed from our sequence by 6 bp, which is close to the mean difference among their strains, and the original autoradiograph was recently confirmed for accuracy (T. Jenkins, personal communication). Relative to the BC strain, the remaining 19 surveyed sequences presented by Jenkins, Basten, and Anderson (1996) differed at 6–16 sites (mean = 8.78). Hence, the differences among the sequences of Jenkins, Basten, and Anderson (1996) for D. pseudoobscura were on the same order as those observed among distantly related Drosophila species in the survey of Simon et al. (1996).

Relative to the published sequence of D. yakuba, the D. pseudoobscura sequences of Jenkins, Basten, and Anderson (1996) differed at 5–14 bp. In contrast, our primary D. pseudoobscura sequence differed from the D. yakuba sequence by only 4 bp, and the unique allele among our D. p. bogotana sequences differed from the D. yakuba sequence by only 5 bp.

Secondary-Structure Analysis
The high level of within-species variation of 12S rRNA in the data of Jenkins, Basten, and Anderson (1996) was surprising, especially in light of the low variation observed across the genus. If this represented real variation in a functional copy of the 12S rRNA gene, then the pattern of substitutions should reflect selection imposed by functional constraints on the sequence, such as base-pairing and highly conserved sequence motifs (Hickson et al. 1996). First, we examined all distinct substitutions between D. yakuba and the six other Drosophila species surveyed that either disrupt base-pairing in the secondary structure proposed by Clary and Wolstenholme (1985) or change phylogenetically conserved bases described by Hickson et al. (1996) that are present in the D. yakuba sequence. Of 25 distinct substitutions between these species and D. yakuba, four (16.0%) would disrupt either predicted stems or conserved bases. In contrast, when we considered all differences between the D. yakuba sequence and the data of Jenkins, Basten, and Anderson (1996), 20 of 45 differences (44.4%) would disrupt either predicted stems or conserved bases. The fraction of differences disrupting structurally important sequences in the latter comparison was significantly greater than that seen in the between-species comparison (Fisher's exact test; P = 0.019). Based on these observations, it seems unlikely that the sequences reported by Jenkins, Basten, and Anderson (1996) represent functional 12S rRNA genes. None of the differences observed between D. yakuba and the new D. pseudoobscura and D. persimilis sequences reported here change conserved bases or disrupt pairing of bases.
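The Fisher's exact test on these counts can be checked with a short standard-library sketch. The 2 × 2 table is built from the numbers in the paragraph above (4 disruptive and 21 other substitutions between species; 20 disruptive and 25 other differences in the Jenkins, Basten, and Anderson data). The two-sided convention used here (summing all tables no more probable than the observed one) is an assumption, since the text does not state which form was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    The P value is the total hypergeometric probability of all tables,
    with the same margins, that are no more probable than the observed one.
    """
    row_1 = a + b
    col_1 = a + c
    total = a + b + c + d
    denominator = comb(total, row_1)

    def probability(k):
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(col_1, k) * comb(total - col_1, row_1 - k) / denominator

    observed = probability(a)
    lowest = max(0, row_1 - (total - col_1))
    highest = min(row_1, col_1)
    p_value = sum(
        probability(k)
        for k in range(lowest, highest + 1)
        if probability(k) <= observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Rows: between-species comparison, Jenkins et al. comparison.
# Columns: disruptive differences, other differences.
p = fisher_exact_two_sided(4, 21, 20, 25)
print(f"{4 / 25:.1%} vs. {20 / 45:.1%}, two-sided P = {p:.3f}")
```

Under this convention the result should come out close to the reported P = 0.019.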

As a second test, we compared how sites differing from the D. yakuba 12S rRNA sequence were distributed between conserved sites plus sites in stems (hereinafter referred to jointly as "conserved sites") and nonconserved sites. We made this comparison both among species, using the data of Simon et al. (1996) and our study, and among strains, using the data of Jenkins, Basten, and Anderson (1996). We predicted that, for a functional gene, observed differences would be significantly clustered in the nonconserved sites. In the 294-base region that we examined, 186 bases are in conserved sites and 108 are not. Jenkins, Basten, and Anderson (1996) identified 40 differences, of which 19 occurred in conserved sites. This was not significantly different from a random distribution of differences (χ² = 3.7, P > 0.05). In contrast, our work and that of Simon et al. (1996) identified 24 differences, but only 9 of these differences were in conserved sites. These differences were significantly clustered in nonconserved bases (χ² = 6.2, P = 0.0127). The differences relative to D. yakuba among species and among strains did not differ from each other significantly (χ² = 0.6), although this may result from the small number of differences in both data sets.
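The final comparison in this paragraph, among strains versus among species, can be laid out as a 2 × 2 contingency table from the counts given above (19 of 40 differences in conserved sites versus 9 of 24). The sketch below assumes a Pearson chi-square without continuity correction; the text does not specify the exact form of the test, and the function name is hypothetical. It covers only this last comparison, not the two goodness-of-fit tests.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: Jenkins et al. strains, species (Simon et al. plus this study).
# Columns: differences in conserved sites, differences in nonconserved sites.
chi2 = chi_square_2x2(19, 40 - 19, 9, 24 - 9)
print(round(chi2, 2))
```

With one degree of freedom, the 5% critical value is 3.84, so a statistic of this size is far from significant, in line with the reported χ² = 0.6.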

As a third test, the predicted stability of the folded rRNA structure for a portion of the molecule was compared using the programs of Mathews et al. (1999). The region from base 85 to base 164, containing helices 38–42 (Hickson et al. 1996), was chosen for comparison because its structure is highly conserved phylogenetically, and many differences were found in this region between D. yakuba and the data of Jenkins, Basten, and Anderson (1996). The predicted stability of nine sequences from Jenkins, Basten, and Anderson (1996) with base pair differences in this region was compared with the predicted stability of this region in nine Drosophila species. Sequences were constrained to fold into the proposed D. yakuba secondary structure (Clary and Wolstenholme 1985), and the free energy of the structure was determined. Among the different species of Drosophila, the most stable structure was that of D. yakuba (ΔG = +2.5 kcal/mol), and the least stable was that of D. subsilvestris (ΔG = +4.7 kcal/mol), with a mean of ΔG = +2.9 ± 0.7 kcal/mol. Of the nine sequences from Jenkins, Basten, and Anderson (1996) that differed from D. yakuba in this region, the most stable structure was that of BC (ΔG = +4.8 kcal/mol), and the least stable was that of TEX86 (ΔG = +12.4 kcal/mol), with a mean of ΔG = +8.9 ± 2.9 kcal/mol. Thus, all sequences of Jenkins, Basten, and Anderson (1996) that differed from D. yakuba in this region would be less stable than the sequences of other Drosophila species when folded into this secondary structure.

Two other potential secondary structures for this region were also considered: a minimal-energy structure predicted by the mfold program (Mathews et al. 1999), and a structure based on the structures of Hickson et al. (1996) with a symmetrical unpaired loop at the site of the conserved, unpaired CAA (bases 56–58). For the minimal-energy structure, ΔG for the various Drosophila species ranged from -10.4 kcal/mol to -6.2 kcal/mol, with a mean of ΔG = -9.5 ± 1.6 kcal/mol. With this structure, the Jenkins, Basten, and Anderson (1996) sequences that differed from D. yakuba had ΔG values that ranged from -5.5 kcal/mol to +8.7 kcal/mol, with a mean of ΔG = +1.1 ± 5.2 kcal/mol.

For the structure based on Hickson et al. (1996), ΔG for the various Drosophila species ranged from -10.0 to -5.8 kcal/mol, with a mean of ΔG = -9.1 ± 1.6 kcal/mol. With this structure, the Jenkins, Basten, and Anderson (1996) sequences that differed from D. yakuba in this region had ΔG values that ranged from -17.4 to +13.2 kcal/mol, with a mean of ΔG = -3.3 ± 9.0 kcal/mol. For this last structure, three of the nine Jenkins, Basten, and Anderson (1996) sequences had ΔG values suggesting that the structure was at least as stable as the least stable Drosophila species, while the remaining six sequences had structures predicted to be less stable than those of any of the other species. In summary, under one or more of the three secondary structures considered, most sequences from the Jenkins, Basten, and Anderson (1996) study were predicted to be much less stable than those of any of the Drosophila species considered.


    Discussion
 
Several lines of evidence indicate that the low degree of sequence polymorphism found in this study is more representative of the true variation in the third domain of the 12S rRNA locus in D. pseudoobscura than the extremely high level of polymorphism reported by Jenkins, Basten, and Anderson (1996). First, we were unable to confirm any of the published sequences, and we found only a single variable site in this region. Second, the divergence reported by Jenkins, Basten, and Anderson (1996) among D. pseudoobscura lines was as great as or greater than that observed between distantly related species in the genus Drosophila. Our observation of only limited variation is more consistent with what one might expect for such a slow-evolving gene (Simon et al. 1996). Third, polymorphisms reported by Jenkins, Basten, and Anderson (1996) would be expected to severely disrupt the 12S rRNA secondary structure. In contrast, very few of the polymorphisms differentiating the 12S rRNA of nine different Drosophila species, and none of the differences we found between D. yakuba and D. pseudoobscura, would adversely affect 12S rRNA secondary structure. Hence, we conclude that the third domain of the 12S rRNA locus in D. pseudoobscura is relatively invariant.

We were unable to amplify any D. pseudoobscura sequences that were identical to those presented by Jenkins, Basten, and Anderson (1996), even when using both the same primers and lines as closely related to theirs as possible. One possible explanation is that this locus was transposed to the nuclear genome and currently exists as an unconstrained pseudogene. This possibility could explain the high level of variability in the Jenkins, Basten, and Anderson (1996) study. Evidence for this has been noted in 12S rRNA sequences in various insects and vertebrates (e.g., van der Kuyl et al. 1995; Zhang and Hewitt 1996). We cannot rule out the possibility that some slight difference in amplification conditions may have caused us to amplify the mitochondrial 12S sequence and Jenkins, Basten, and Anderson (1996) to amplify its nuclear counterpart, even when using the same primers. However, our study provides no direct evidence for the existence of such a pseudogene. Other possibilities, such as misincorporation of nucleotides during PCR, may also explain the discrepancy between our results and those of Jenkins, Basten, and Anderson (1996).

This study underscores the utility of examining novel sequences in both phylogenetic and functional contexts, if possible (Hickson et al. 1996). Incorporating the broadest possible phylogenetic context can allow one to compare rates of intraspecific and interspecific evolutionary changes. Such a comparison can increase the breadth of the questions being addressed in any individual study or identify potentially problematic sequences. Furthermore, as noted by Hickson et al. (1996), considering the functional constraints on molecules also offers the potential for identifying patterns of molecular evolution or problematic sequences.


    Supplementary Material
 
The nucleotide sequences new to this paper are available from the EMBL/GenBank databases under accession numbers AF220070–AF220095. The alignment of these sequences will be made available on the World-Wide Web for a minimum of 2 years after publication from http://www.biology.lsu.edu/webfac/mnoor/align.doc and will be available from the authors thereafter.


    Acknowledgements
 
We thank W. Anderson, M. Hellberg, T. Jenkins, and two anonymous reviewers for useful comments and discussions in the course of preparing this manuscript. M.A.F.N. was funded by NIH grant GM58060 (subcontracted through J. Hey at Rutgers University) and J.C.L. was funded by NSF grant IBN9728047.


    Footnotes
 
Axel Meyer, Reviewing Editor

1 Keywords: Drosophila pseudoobscura, 12S ribosomal RNA, small ribosomal RNA, sequence variation, RNA secondary structure

2 Address for correspondence and reprints: Mohamed A. F. Noor, Department of Biological Sciences, Life Sciences Building, Louisiana State University, Baton Rouge, Louisiana 70803. E-mail: mnoor@lsu.edu.


    Literature Cited
 

    Clary, D. O., and D. R. Wolstenholme. 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22:252–271.

    Gloor, G. B., C. R. Preston, D. M. Johnson-Schlitz, N. A. Nassif, R. W. Phillis, W. K. Benz, H. M. Robertson, and W. R. Engels. 1993. Type I repressors of P element mobility. Genetics 135:81–95.

    Hickson, R. E., C. Simon, A. Cooper, G. S. Spicer, J. Sullivan, and D. Penny. 1996. Conserved sequence motifs, alignment, and secondary structure for the third domain of animal 12S rRNA. Mol. Biol. Evol. 13:150–169.

    Jenkins, T. M., C. J. Basten, and W. W. Anderson. 1996. Mitochondrial gene divergence of Colombian Drosophila pseudoobscura. Mol. Biol. Evol. 13:1266–1275.

    Mathews, D. H., J. Sabina, M. Zuker, and D. H. Turner. 1999. Expanded sequence dependence of thermodynamic parameters provides robust prediction of RNA secondary structure. J. Mol. Biol. 288:911–940.

    Noor, M. A. F., M. D. Schug, and C. F. Aquadro. 2000. Microsatellite variation in populations of Drosophila pseudoobscura and Drosophila persimilis. Genet. Res. Camb. 75:25–35.

    Palumbi, S. R. 1996. Nucleic acids II: the polymerase chain reaction. Pp. 205–247 in D. M. Hillis, C. Moritz, and B. K. Mable, eds. Molecular systematics. Sinauer, Sunderland, Mass.

    Powell, J. R. 1997. Progress and prospects in evolutionary biology: the Drosophila model. Oxford University Press, New York.

    Simon, C., L. Nigro, J. Sullivan, K. Holsinger, A. Martin, A. Grapputo, A. Franke, and C. McIntosh. 1996. Large differences in substitutional pattern and evolutionary rate of 12S ribosomal RNA genes. Mol. Biol. Evol. 13:923–932.

    van der Kuyl, A. C., C. L. Kuiken, J. T. Dekker, W. R. K. Perizonius, and J. Goudsmit. 1995. Nuclear counterparts of the cytoplasmic mitochondrial 12S rRNA gene: a problem of ancient DNA and molecular phylogenies. J. Mol. Evol. 40:652–657.

    Zhang, D.-X., and G. M. Hewitt. 1996. Nuclear integrations: challenges for mitochondrial DNA markers. Trends Ecol. Evol. 11:247–251.

Accepted for publication February 25, 2000.