Institute of Cell, Animal and Population Biology, University of Edinburgh, United Kingdom
Correspondence: E-mail: xulio.maside{at}ed.ac.uk.
Abstract |
---|
Key Words: transposable elements, horizontal transfer, S-element, Drosophila melanogaster
Introduction |
---|
Class II elements show characteristic terminal inverted repeats (TIRs), flanking a gene encoding a transposase. One of the most representative groups, the Tc1/mariner superfamily, includes TE families found in the genomes of a wide variety of organisms, including nematodes, insects, and vertebrates (Plasterk, Izsvak, and Ivics 1999). It has been proposed that the ultimate fate of these intragenomic parasites is to be silenced by the host, which would lead to their elimination from the genome (Hartl et al. 1997). However, the broad taxonomic distribution of this superfamily suggests that they have properties that allow them to persist.
One means of avoiding vertical loss is to jump from one species to another (horizontal transfer) (Hartl et al. 1997). The spread of the P-element family among Drosophila species is probably the best-documented example of such events in nature (Kidwell 1992; Pinsker et al. 2001). However, although the potential for horizontal transfer has been demonstrated in the laboratory (Plasterk, Izsvak, and Ivics 1999), direct evidence for horizontal transfer of other eukaryotic TEs in natural populations is scarce.
The S-element family is a member of the Tc1/mariner superfamily and is ubiquitous in Drosophila melanogaster populations (Merriman et al. 1995). Interestingly, Southern blotting does not provide convincing evidence for S elements in any of 19 other Drosophila species surveyed (Merriman et al. 1995), including other members of the D. melanogaster subgroup (D. mauritiana and D. simulans). This suggests that S elements could have invaded D. melanogaster by horizontal transfer. Here we report a study of the molecular evolution of the S-element family in D. melanogaster that supports this hypothesis.
Materials and Methods |
---|
DNA Polymorphism Analysis
The high quality of the Release 3 sequence (Celniker et al. 2002) means that the probability of nucleotide differences among elements due to sequencing errors is negligible. DNA sequences were aligned with ClustalX 1.8 (Thompson et al. 1997) and adjusted manually. Nucleotide variation was analyzed with DnaSP 3.97 beta (Rozas 1999) and Proseq 2.9 beta (provided by D. Filatov). Maximum-likelihood estimates of the genetic distances between the elements were obtained with PAML 3.13a (Yang 1997) and with Mega 2.1 (Kumar et al. 2001), and phylogenetic relations were analyzed with Mega 2.1.
To test whether the sample of 14 element sequences used in this study represented the patterns of variation in the S-element family as a whole, we ran similar analyses on various sets of increasingly larger numbers of sequences that covered shorter fractions of the element consensus sequence. The results were always consistent with those described below. The estimates of within-group diversity (see below) were usually slightly higher, due to higher variance among pairwise estimates associated with the use of shorter sequences (data not shown).
Results and Discussion |
---|
In situ data suggest that S elements are found in different locations in different strains, consistent with the low frequencies of typical insertions demanded by the model (Merriman et al. 1995). In addition, the mean copy number for this family is 37.4 from a survey of 10 populations (Merriman et al. 1995). Because in situ estimates include fragmented copies as well as full-length element copies, this is likely to be an overestimate of the true number of intact copies. Even if we only use the 14 full-length elements found in the sequenced genome, the expected diversity would be approximately 0.20, which is far larger than the observed value (see above).
We also have evidence that the distribution of frequencies of variants among the sampled sequences departs from simple expectations. Under the null hypothesis of neutrality, together with an equilibrium copy number distribution and low element frequencies, the genealogy of a set of elements should approximate that given by the standard exponential distribution of time to coalescence for ordinary genes, with an expected time to coalescence of twice the above effective population size (Hudson and Kaplan 1986). This result is also true in the presence of gene conversion among members of the element family, with an appropriate correction to the expression for effective population size (Hudson and Kaplan 1986).
Tests for departure from these assumptions can thus be conducted by the standard methods of molecular population genetics, such as Tajima's D statistic (Tajima 1989). We have examined such departures for mutations at silent sites (noncoding and synonymous sites) in our data set. The resulting value of the Tajima test (Tajima's D statistic = -1.565, P < 0.05; assuming no recombination) indicates a significant excess of low-frequency variants over what is expected under the null hypothesis of neutrality and a genealogy generated by a standard coalescent process. This is also observed for the noncoding DNA (TIRs and sequences flanking the transposase gene), for which Tajima's D statistic = -1.626 (P < 0.05).
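As an illustration of the statistic used here, the following minimal Python sketch computes Tajima's D from a set of aligned sequences using the standard formulas of Tajima (1989). It is not the DnaSP or Proseq implementation used for the analyses; the function name `tajimas_d` and the decision to skip any alignment column containing a gap or ambiguous base are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for a list of equal-length aligned sequences.

    Assumption of this sketch: columns containing anything other than
    A, C, G, or T are ignored entirely.
    """
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    n_pairs = n * (n - 1) / 2
    seg_sites = 0   # S, the number of segregating sites
    pi = 0.0        # mean number of pairwise differences
    for column in zip(*seqs):
        if any(base not in "ACGT" for base in column):
            continue
        counts = Counter(column)
        if len(counts) < 2:
            continue
        seg_sites += 1
        # Number of sequence pairs that differ at this site.
        differing = (n * n - sum(c * c for c in counts.values())) / 2
        pi += differing / n_pairs
    if seg_sites == 0:
        raise ValueError("no segregating sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)
```

In this formulation, a sample dominated by singletons gives a negative D, whereas variants at intermediate frequencies give a positive D. The significance of an observed value would still have to be assessed separately, for example by coalescent simulation.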
The reduced variation at these sites, as compared with the synonymous sites of the transposase gene (table 1), is consistent with the idea that these noncoding sequences play an important role in transposition (Plasterk, Izsvak, and Ivics 1999) and with the fact that some of these copies are actively transposing (Merriman et al. 1995). The significant departures of the variant frequencies from neutrality at both putatively neutral and selected sites might reflect the effects of selection at closely linked sites (Tachida 2000; McVean and Charlesworth, unpublished data). However, given that such departures are rarely seen for nuclear genes in D. melanogaster populations (Andolfatto and Przeworski 2001), there is no reason to expect them to be found for transposable elements. The same argument applies to the much lower than expected diversity for the element family. It seems more likely that these effects are a consequence of the demographic history of the elements (see below).
As mentioned above, the S element has only been definitely identified in D. melanogaster, despite attempts to find it in 19 other Drosophila species (Merriman et al. 1995). This is in agreement with data from in situ hybridization on polytene chromosomes from various D. simulans isofemale lines (Maside, unpublished data) and leaves two alternative hypotheses to explain the presence of the S-element family in D. melanogaster alone: (1) the S-element family was present in a common ancestor of the D. melanogaster species group but has been systematically lost in all closely related species in a series of independent events, and (2) the S element invaded the genome of the ancestor of D. melanogaster after it diverged from other lineages.
The closest known relative of the S element is Paris (Tu and Shao 2002), a Tc1-like element found in Drosophila virilis. This species belongs to the Drosophila subgenus, which diverged from the Sophophora subgenus, where D. melanogaster is included, approximately 40 MYA (Russo, Takezaki, and Nei 1995). The first hypothesis thus implies the loss of the S-element family from hundreds of related Drosophila species and can be discarded on grounds of parsimony.
On the other hand, the phylogeny of the 14 members of the S-element family represented in the published D. melanogaster genome sequence, which correspond to the actual element insertions in the y[1]; cn[1] bw[1] sp[1] strain (see http://www.fruitfly.org/annot/release3.html for details), can be used to make some inferences regarding the evolutionary history of the family in this species (see fig. 1 in Supplementary Material online). The sequences cluster into two genetically highly differentiated groups (FST = 0.527, P < 0.05 in a permutation test with 5,000 replicates [Hudson, Boos, and Kaplan 1992]). The mean pairwise synonymous divergence between the two groups is 0.154 ± 0.024. Given that any ongoing gene conversion among copies (see below) tends to reduce the mean divergence between a pair of sequences (Hudson and Kaplan 1986), this estimate should be considered a lower bound to the real divergence between the two clusters.
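The logic of a permutation test for differentiation between two groups of sequences can be sketched as follows. The text does not state which FST estimator was used, so this example assumes one common sequence-based form, FST = 1 - Hw/Hb, where Hw is the mean number of pairwise differences within groups and Hb the mean between groups. The function names, the default seed, and the toy data in the usage note are all hypothetical.

```python
import random
from itertools import combinations


def n_diff(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(group):
    """Mean pairwise differences within one group (needs >= 2 sequences)."""
    pairs = list(combinations(group, 2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def fst(group1, group2):
    """FST = 1 - Hw/Hb (assumed estimator; see lead-in)."""
    hw = (mean_within(group1) + mean_within(group2)) / 2
    between = [n_diff(a, b) for a in group1 for b in group2]
    hb = sum(between) / len(between)
    return 0.0 if hb == 0 else 1 - hw / hb


def permutation_p(group1, group2, replicates=5000, seed=0):
    """Return (observed FST, fraction of label shuffles with FST >= observed)."""
    rng = random.Random(seed)
    observed = fst(group1, group2)
    pooled = list(group1) + list(group2)
    k = len(group1)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(pooled)  # reassign sequences to groups at random
        if fst(pooled[:k], pooled[k:]) >= observed - 1e-12:
            hits += 1
    return observed, hits / replicates
```

For example, `permutation_p(["AAAA", "AAAA", "AAAT"], ["TTTT", "TTTT", "TTTA"], replicates=200)` returns an observed FST of 0.8125 together with the proportion of random regroupings that are at least as differentiated.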
This value is, however, significantly larger than two independent estimates of the mean synonymous divergence between single-copy genes from D. melanogaster and D. simulans, 0.088 ± 0.0073 and 0.112 ± 0.004 (mean ± SE [Takano 1998; Betancourt, Presgraves, and Swanson 2002]), respectively (P < 0.02 in both two-tailed t-tests). If the rate of evolution of the element sequences is similar to that typical of the host genome, as seems reasonable since the elements are scattered fairly randomly over the genome, this suggests that the split within the S-element family preceded the split of the ancestor of D. melanogaster from the ancestor of its siblings, D. simulans, D. sechellia, and D. mauritiana (Hey and Kliman 1993), around 2.5 to 3.5 MYA.
This scenario is compatible with two hypotheses: (1) the S element invaded the ancestor of the melanogaster complex, but was subsequently lost in the lineage that gave rise to D. simulans, D. sechellia, and D. mauritiana, and (2) two different S-element subfamilies arrived independently in the D. melanogaster lineage after its separation, representing two separate horizontal transfer events. In the latter case, genetic differentiation between the two subfamilies would be expected to cause a haplotypic structure in a combined sample that departs significantly from that expected under the standard coalescent process. Fu's Fs test (Fu 1997) and Wall's B and Q tests (Wall 1999) for haplotypes failed to detect a significant departure from neutral expectations in our data set. This is consistent with a single invasion followed by mutational differentiation of the sequences. The departure from neutrality of variant frequencies and the reduced total variability suggest that there was a fairly long period when copy numbers were low, followed by the expansion of copy numbers to their present levels. If the invasion occurred before the divergence of the D. melanogaster lineage from the lineage leading to its sibling species, the initially low copy number might have led to stochastic loss of the elements from the latter (Charlesworth 1985).
It should also be pointed out that the tnpase gene of the S element has a very low level of codon usage bias, ENC = 58.4 (Wright 1990; estimated from the reference sequence, U33463), a value typical of the least biased single-copy genes of D. melanogaster or D. simulans (Powell and Moriyama 1997). Given that the rate of divergence at synonymous sites between many Drosophila species is negatively associated with codon usage bias (Powell and Moriyama 1997), it is interesting to note that the level of synonymous divergence between the two clusters of S elements is about the same as that observed among single-copy genes of D. melanogaster and D. simulans of similar ENC values (see fig. 5 in Powell and Moriyama 1997). Assuming that the evolutionary forces responsible for the above association operate on TEs in the same way as on single-copy genes, this means that the split between the two groups of elements could have occurred in the same period as the split between the two species.
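For readers unfamiliar with the ENC measure, the sketch below shows how the effective number of codons of Wright (1990) can be computed from a coding sequence: values near 20 indicate extreme bias (one codon per amino acid) and values near 61 indicate nearly uniform codon usage. This is a simplified illustration, not the procedure used to obtain the value quoted above. The function name is hypothetical, and two simplifications are assumed: amino acids observed fewer than twice, or with a non-positive homozygosity estimate, are left out of the class averages, and the function raises an error rather than applying Wright's correction when a degeneracy class is empty.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
# Number of synonymous codons for each amino acid.
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
# Number of amino acids in each degeneracy class.
CLASS_SIZE = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(cds):
    """Wright's ENC for an in-frame coding sequence (simplified sketch)."""
    cds = cds.upper()
    usage = defaultdict(Counter)
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is not None and aa != "*":
            usage[aa][codon] += 1
    homozygosity = defaultdict(list)
    for aa, counts in usage.items():
        k = DEGENERACY[aa]
        n = sum(counts.values())
        if k == 1 or n < 2:
            continue  # Met and Trp, or too few observations
        f = (n * sum((c / n) ** 2 for c in counts.values()) - 1) / (n - 1)
        if f > 0:  # simplification: drop non-positive estimates
            homozygosity[k].append(f)
    enc = 2.0  # contribution of Met and Trp
    for k, size in CLASS_SIZE.items():
        if not homozygosity[k]:
            raise ValueError(f"no usable amino acids with {k} codons")
        enc += size / (sum(homozygosity[k]) / len(homozygosity[k]))
    return min(enc, 61.0)
```

The cap at 61 follows Wright's convention, since the estimate can slightly exceed the number of sense codons when usage is close to uniform.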
We have also examined the possibility of genetic exchange among the elements by means of ongoing gene conversion between copies (Ohta 1985; Charlesworth 1986; Hudson and Kaplan 1986). We applied the four-alleles test (Hudson and Kaplan 1985) to search for recombination events among the sequences in our sample. The test is based on the detection of pairs of variable sites for which the four possible alleles resulting from the combination of the variants at each site are present in the sample. In this way, we detected a minimum of 28 recombination events among the 14 sequences in our data set (fig. 1). Even if we assume that some of these positive results could be caused by other factors, such as the existence of hypervariable sites, it seems unlikely that this could entirely explain the observations. First, the 28 recombination events suggested by the test are evenly distributed along the element sequence. Second, the method of Betrán et al. (1997) allowed us to identify four sequence tracts that have been exchanged between copies of the two groups.
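The four-alleles criterion described above can be sketched in a few lines of Python. The code lists every pair of biallelic sites at which all four two-site combinations are present, then derives a lower bound on the number of recombination events in the spirit of Hudson and Kaplan's (1985) minimum, taken here as the largest set of non-overlapping incompatible intervals. Function names are hypothetical, columns with gaps or more than two states are ignored as a simplifying assumption, and this is not the software used for the analysis reported here.

```python
from itertools import combinations


def biallelic_sites(seqs):
    """Indices of alignment columns with exactly two states among A, C, G, T."""
    sites = []
    for i, column in enumerate(zip(*seqs)):
        if all(b in "ACGT" for b in column) and len(set(column)) == 2:
            sites.append(i)
    return sites


def incompatible_pairs(seqs):
    """Pairs of sites (i, j) for which all four two-site haplotypes occur."""
    pairs = []
    for i, j in combinations(biallelic_sites(seqs), 2):
        if len({(s[i], s[j]) for s in seqs}) == 4:
            pairs.append((i, j))
    return pairs


def min_recombination_events(seqs):
    """Lower bound: maximum number of non-overlapping incompatible intervals.

    Intervals that share only an end site count as non-overlapping,
    because the inferred breakpoints lie strictly between sites.
    """
    last_right = -1
    events = 0
    # Greedy selection by right end point.
    for left, right in sorted(incompatible_pairs(seqs), key=lambda p: p[1]):
        if left >= last_right:
            events += 1
            last_right = right
    return events
```

For the toy sample `["AAA", "ATA", "TAT", "TTT"]`, sites 0 and 1 and sites 1 and 2 each show all four combinations, so at least two events are inferred. As the text notes, recurrent mutation at hypervariable sites can also produce such patterns, so the count is informative only under an infinite-sites assumption.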
In summary, our analysis of the nucleotide variation of the members of the S-element family in a sample genome of D. melanogaster has allowed us to reconstruct the evolutionary history of this family. It seems likely that the S-element family invaded the genome of the ancestor of D. melanogaster at about the same time as the split of the proto-simulans lineage. The family remained at low copy number in the D. melanogaster lineage for a prolonged period of time, although the ancestors of the two groups of sequences identified in the sample (see Supplementary Material online) must have been present during this period, given their high level of divergence. Such low copy numbers would probably have reduced the rate of gene conversion between different elements. The present level of copy number has been reached only some considerable time after the initial invasion, although it is difficult to be very precise about this. This interpretation is consistent with the relatively low levels of diversity among element sequences and the significant excess of low frequency variants at synonymous sites in the samples.
Acknowledgements |
---|
Footnotes |
---|
Literature Cited |
---|
Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
Andolfatto, P. 2001. Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18:279-290.
Andolfatto, P., and M. Przeworski. 2001. Regions of lower crossing over harbor more rare variants in African populations of Drosophila melanogaster. Genetics. 158:657-665.
Berg, D. E., and M. M. Howe. 1989. Mobile DNA. American Society for Microbiology, Washington DC.
Betancourt, A. J., D. C. Presgraves, and W. J. Swanson. 2002. A test for faster X evolution in Drosophila. Mol. Biol. Evol. 19:1816-1819.
Betrán, E., J. Rozas, A. Navarro, and A. Barbadilla. 1997. The estimation of the number and the length distribution of gene conversion tracts from population DNA sequence data. Genetics. 146:89-99.
Brookfield, J. F. 1986. A model for DNA sequence evolution within transposable element families. Genetics. 112:393-407.
Capy, P. 1997. Evolution and impact of transposable elements. Kluwer, Dordrecht, The Netherlands.
Celniker, S. E., D. A. Wheeler, and B. Kronmiller, et al. (32 co-authors). 2002. Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence. Genome Biol. 3:RESEARCH0079.1-14.
Charlesworth, B. 1985. The population genetics of transposable elements. Pp. 213-232 in T. Ohta and K. Aoki, eds. Population genetics and molecular evolution. Japan Science Society Press, Springer-Verlag, Berlin.
Charlesworth, B. 1986. Genetic divergence between transposable elements. Genet. Res. 48:111-118.
Charlesworth, B. 1996. Background selection and patterns of genetic diversity in Drosophila melanogaster. Genet. Res. 68:131-149.
Fu, Y. X. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 147:915-925.
Gloor, G. B., N. A. Nassif, D. M. Johnson-Schlitz, C. R. Preston, and W. R. Engels. 1991. Targeted gene replacement in Drosophila via P element-induced gap repair. Science. 253:1110-1117.
Hartl, D. L., E. R. Lozovskaya, D. I. Nurminsky, and A. R. Lohe. 1997. What restricts the activity of mariner-like transposable elements? Trends Genet. 13:197-201.
Hey, J., and R. M. Kliman. 1993. Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Mol. Biol. Evol. 10:804-822.
Hudson, R. R., D. D. Boos, and N. L. Kaplan. 1992. A statistical test for detecting geographic subdivision. Mol. Biol. Evol. 9:138-151.
Hudson, R. R., and N. L. Kaplan. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics. 111:147-164.
Hudson, R. R., and N. L. Kaplan. 1986. On the divergence of members of a transposable element family. J. Math. Biol. 24:207-215.
Jensen, M. A., B. Charlesworth, and M. Kreitman. 2002. Patterns of genetic variation at a chromosome 4 locus of Drosophila melanogaster and D. simulans. Genetics. 160:493-507.
Kidwell, M. G. 1992. Horizontal transfer. Curr. Opin. Genet. Dev. 2:868-873.
Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Arizona State University, Tempe, Arizona.
Langley, C. H., B. P. Lazzaro, W. Phillips, E. Heikkinen, and J. M. Braverman. 2000. Linkage disequilibria and the site frequency spectra in the su(s) and su(w(a)) regions of the Drosophila melanogaster X chromosome. Genetics. 156:1837-1852.
Merriman, P. J., C. D. Grimes, J. Ambroziak, D. A. Hackett, P. Skinner, and M. J. Simmons. 1995. S elements: a family of Tc1-like transposons in the genome of Drosophila melanogaster. Genetics. 141:1425-1438.
Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.
Ohta, T. 1985. A model of duplicative transposition and gene conversion for repetitive DNA families. Genetics. 110:513-524.
Pinsker, W., E. Haring, S. Hagemann, and W. J. Miller. 2001. The evolutionary life history of P transposons: from horizontal invaders to domesticated neogenes. Chromosoma. 110:148-158.
Plasterk, R. H., Z. Izsvak, and Z. Ivics. 1999. Resident aliens: the Tc1/mariner superfamily of transposable elements. Trends Genet. 15:326-332.
Powell, J. R., and E. N. Moriyama. 1997. Evolution of codon usage bias in Drosophila. Proc. Natl. Acad. Sci. USA. 94:7784-7790.
Rozas, J. R. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics. 15:174-175.
Russo, C., N. Takezaki, and M. Nei. 1995. Molecular phylogeny and divergence times of drosophilid species. Mol. Biol. Evol. 12:391-404.
Tachida, H. 2000. DNA evolution under weak selection. Gene. 261:3-9.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 123:585-595.
Takano, T. S. 1998. Rate variation of DNA sequence evolution in the Drosophila lineages. Genetics. 149:959-970.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.
Tu, Z., and H. Shao. 2002. Intra- and inter-specific diversity of Tc3-like transposons in nematodes and insects and implications for their evolution and transposition. Gene. 282:133-142.
Wall, J. D. 1999. Recombination and the power of statistical tests of neutrality. Genet. Res. 74:65-79.
Wang, W., K. Thornton, A. Berry, and M. Long. 2002. Nucleotide variation along the Drosophila melanogaster fourth chromosome. Science. 295:134-137.
Watterson, G. A. 1975. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7:256-276.
Wright, F. 1990. The 'effective number of codons' used in a gene. Gene. 87:23-29.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555-556.