Department of Biological Sciences, Stanford University
Abstract
Key Words: transposable elements, ectopic recombination, copy number maintenance
Introduction
The molecular nature, population dynamics, and evolution of TEs in the Drosophila genome have been the subject of intense investigation for the last 20 years (Finnegan and Fawcett 1986; Berg and Howe 1989; Charlesworth and Langley 1989; Charlesworth, Sniegowski, and Stephan 1994; Nuzhdin 1999; Bartolome, Maside, and Charlesworth 2002; Craig et al. 2002; Kaminker et al. 2002). These studies have provided evidence that TEs in the Drosophila genome fall into a large assortment (100) of diverse families, with each family present in a limited copy number in the euchromatic portion of the genome (<150 per genome), and most copies present at very low population frequencies (<5%).
Spread of TEs in the Drosophila genome is likely to be limited both by regulation of the transposition rate and by natural selection against individual TE copies as TEs become more numerous. Regulation of transposition by either TE-driven or host-driven mechanisms undoubtedly takes place (Laski, Rio, and Rubin 1986; Kidwell 1989; Lozovskaya, Hartl, and Petrov 1995; Petrov et al. 1995; Lohe and Hartl 1996; Nuzhdin et al. 1998; Ketting et al. 1999; Aravin et al. 2001; Robert et al. 2001) and is very important in determining the population dynamics of TEs. However, by itself, the regulation of transposition rate is insufficient to explain all features of TE distributions. In particular, the low population frequencies of individual copies of TEs in Drosophila euchromatin must be due to natural selection against individual TE copies (Charlesworth and Charlesworth 1983; Kaplan and Brookfield 1983; Langley, Brookfield, and Kaplan 1983).
Several distinct but not mutually exclusive hypotheses about the nature of selection against individual TE copies have been proposed (for review see Nuzhdin 1999). First, individual TE copies may be deleterious because they disrupt genes, by affecting either their coding capacity or their regulation ("gene-disruption model") (Finnegan 1992; McDonald et al. 1997). Second, translation of TE-encoded proteins may be costly, and these proteins may generate deleterious effects by nicking chromosomes and disrupting cellular processes ("deleterious TE-product expression model") (Nuzhdin 1999). Finally, a high copy number of TEs could be harmful because ectopic recombination among numerous dispersed and heterozygous TEs generates strongly deleterious chromosome rearrangements ("ectopic recombination model") (Montgomery, Charlesworth, and Langley 1987).
Many previous studies focused on specifically testing the ectopic recombination model by looking at low versus high recombination areas of the Drosophila genome. These studies have generally (but not always) found a higher abundance of TEs in areas of low recombination (Charlesworth, Lapid, and Canada 1992a, 1992b; Hoogland and Biemont 1996). Because it is believed that areas of low recombination also experience reduced rates of ectopic recombination (Langley et al. 1988; Montgomery et al. 1991; Goldman and Lichten 1996, 2000), these results can be taken to support the ectopic recombination model. However, in addition to presumably having a lower rate of ectopic recombination, low-recombination areas also have lower densities of genes (Adams, Celniker, and Holt 2000), are likely to permit lower levels of gene expression (Becker 1995; Henikoff 1995; Lu, Ma, and Eissenberg 1998; Birchler, Bhadra, and Bhadra 2000), and experience less efficient selection because of the Hill-Robertson effect (Hill and Robertson 1966; although see Charlesworth and Charlesworth 1983 for an investigation of the Hill-Robertson effect on TEs in Drosophila). Thus all three explicit selection hypotheses predict higher copy numbers and higher population frequencies of TEs in genomic regions of low recombination. The fact that this prediction is borne out empirically therefore does not discriminate well among the three current hypotheses.
In this study we attempt to obviate these difficulties by focusing exclusively on one class of TEs (non-long terminal repeat [LTR] retroelements) in the high-recombination areas of the D. melanogaster genome. Non-LTR elements are abundant in the Drosophila genome (Berezikov, Bucheton, and Busseau 2000), and are attractive as a model system for a number of reasons. Because they evolve primarily, or possibly even exclusively, through vertical transmission (Malik, Burke, and Eickbush 1999) and cannot excise precisely from the genome, we can provisionally ignore horizontal transfer or excision in considering their population dynamics. In addition, they commonly generate 5'-truncated DOA (dead-on-arrival) elements as a natural outcome of transposition (Luan et al. 1993). These DOA elements are not transcribed and do not encode functional proteins. Thus they cannot generate potentially deleterious transcripts and proteins, allowing us to discount selection against deleterious expression of TE-encoded proteins as a force acting against individual DOA copies.
Moreover, the variable size of DOA elements at the time of transposition may allow us to discriminate between the ectopic recombination and gene-disruption hypotheses. Specifically, we reasoned that selection against the deleterious effects of ectopic recombination should affect longer elements more strongly than shorter ones, as they represent longer targets for homologous pairing (Dray and Gloor 1997). In a sense, the variation in length among newly transposed non-LTR elements allows us to study variation of the recombination rate among individual TEs, rather than among whole genomic areas. This in turn allows us to escape the confounding correlations of background recombination rate, gene density, and chromatin states in the interpretation of the results. It also allows us to simplify the analysis further by concentrating only on the high recombination areas of the D. melanogaster genome, thus reducing the probability of selective sweeps and background selection (Smith and Haigh 1974; Berry, Ajioka, and Kreitman 1991; Charlesworth, Morgan, and Charlesworth 1993; Hudson and Kaplan 1995) playing a significant role in determining the population dynamics of the studied TEs.
Our analysis provides new evidence that selection against ectopic recombination, rather than against costly expression of TE proteins or gene disruption by individual TEs, limits the spread of at least some non-LTR elements in the Drosophila genome. We also demonstrate that some non-LTR families appear to be under very weak purifying selection, in that they include many insertions that reach high population frequencies and even fixation in the D. melanogaster euchromatin. Combined with the evidence for the importance of ectopic recombination, the observation of TEs at high frequencies suggests that transposition rates vary significantly among TE families, and possibly over longer time scales for the same TE family. We discuss a hypothesis whereby transposition rate for a particular TE family can decline sharply for a period of time, leading to reduced copy numbers and ectopic recombination rates among remaining TE copies. During these periods selection acting on the remaining TE copies may be sufficiently weak to allow fixation of multiple TEs in the Drosophila euchromatin by genetic drift.
Materials and Methods
Drosophila Strains
We used the sequenced strain (y1; cn1; bw1) as a positive control in our polymerase chain reactions (PCRs). Population sampling was carried out in 10 American and 8 Tunisian strains. The American strains are isofemale strains (Wi1, Wi3, Wi15, Wi18, Wi35, Wi41, Wi45, Wi68, Wi69, Wi83) that were collected at the Wolfskill Orchard, Davis, CA, by Sergey Nuzhdin and that have been further subjected to over 30 generations of brother-sister matings (S. Nuzhdin, personal communication). The African strains are isofemale lines (T1, T3, T12, T13, T17, T18, T27, T28) collected in Tunisia, Africa, by Charles Aquadro. The high level of isogenicity in all of these strains was further confirmed by the fact that 6 TE insertions that were highly polymorphic across the tested strains (frequency ranging from 42% to 75%) did not show a single case of presence/absence polymorphism within any of the strains. Overall, there were 53 cases of presence and 41 cases of absence of one of these TEs.
Population Assays
The population frequencies of TEs were assayed by amplifying individual TEs using primer pairs in the flanking regions. The primers were designed with the aid of the Oligo 6 software package (Molecular Biology Insights, Inc., 1988). The primer sequences are provided in table 1.
DNA Sequencing
We verified the PCR identification of the fixed TEs by sequencing them in a number of strains (see Results). In each case, PCR products were enzymatically cleaned with Exonuclease I and Shrimp Alkaline Phosphatase (1 unit of Shrimp Alkaline Phosphatase, 5 units of Exonuclease I, and 1.2 µl of 10-fold reaction buffer [0.2 M Tris and 0.1 M MgCl2] added to a 10 µl PCR reaction; the mixture was incubated at 37°C for 45 min, and the enzymes were inactivated at 70°C for 15 min) and were cycle-sequenced in quarter-reactions according to the ABI 377 sequencing protocol with Big Dye (4 µl of the cleaned PCR products, 2 µl of the Big Dye, 5 µl of the sequencing buffer [160 mM Tris pH 9.0, 10 mM MgCl2], 0.17 µl primer previously diluted to 20 µM, 5 µl ddH2O) under standard cycling conditions (96°C for 1 min, 24 cycles of: 96°C for 10 s, 1.0°C/s to 50.0°C, 50°C for 5 s, 1.0°C/s to 60.0°C, 60.0°C for 4 min, 1.0°C/s to 96.0°C). These reactions were precipitated using ethanol and MgSO4 as described in the ABI sequencing manual, and the sequences were visualized on an ABI 377 automated sequencer. The primers used for amplifying the TEs (table 1) were also used for sequencing. The representative sequences have been deposited in GenBank under the accession numbers AY226801 through AY226814.
Estimating Intensity of Natural Selection
We make use of a diffusion approximation and the resulting sojourn time density function (Ewens 1979, Eqs. 4.22–4.26 & 5.47; Nagylaki 1974) to estimate the probability that an element is at a particular frequency in the population. We assume an infinite number of insertion sites, as in Kaplan and Brookfield (1983). (For an approach that also makes use of the diffusion approximation, but applies slightly different simplifying assumptions, see Charlesworth and Charlesworth [1983].) We assume that the fitness of individuals who are homozygous for the element is […], heterozygotes have fitness […], and homozygotes without the element have fitness 1. Let y represent the vector {x,N,s,h}, where x is the frequency of an element in the population of N diploid individuals. Under these assumptions, the drift and diffusion terms of the diffusion approximation of the standard Wright-Fisher model are […] and […]. Let […] be the expected amount of time that an element that is initially present as a single copy spends on the frequency interval I: ([…]) before it is absorbed at x = 0 or x = 1. Under the standard assumptions of the diffusion approximation, […]
Detecting Adaptive Events Within a Maximum Likelihood Framework
To detect putative adaptive insertions of TEs, we used a likelihood ratio test for heterogeneity in the selection coefficients of transposable elements that belong to a single family. This test works by comparing two nested models of transposable element selective effects: Model 1 (M1) assumes that all TEs are subject to the same strength of selection, and Model 2 (M2) allows each TE to possess one of two different selection coefficients. Under M1, only one free parameter, s*, the selection coefficient of all TEs of a given family, is estimated from the frequency data. Under M2, three free parameters are estimated from the data: s1 and s2 are two distinct selection coefficients, and p is the proportion of elements with selection coefficient s1. The likelihood of the data given M1 is L[N,s*,h], the function described in the previous section. The likelihood of the data given M2 is calculated similarly, but with […]
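The nested-model comparison can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that the M2 likelihood is a two-component mixture over s1 and s2 with weight p, and that twice the log-likelihood difference is compared with a chi-square distribution with 2 degrees of freedom (M2 has two more free parameters than M1). The function names and the example log-likelihoods are hypothetical, and the chi-square approximation can be inaccurate for mixture models.

```python
import math

def mixture_log_likelihood(lik_s1, lik_s2, p):
    """Log-likelihood under an assumed mixture form of M2.

    lik_s1[i] and lik_s2[i] are the likelihoods of TE i's frequency data
    under selection coefficients s1 and s2; p is the proportion of TEs
    with coefficient s1.
    """
    return sum(math.log(p * l1 + (1.0 - p) * l2)
               for l1, l2 in zip(lik_s1, lik_s2))

def likelihood_ratio_test(loglik_m1, loglik_m2):
    """Return (statistic, P value) for M1 nested in M2 with 2 extra parameters.

    For 2 degrees of freedom the chi-square survival function is exp(-x / 2),
    so no external statistics library is needed.
    """
    stat = max(0.0, 2.0 * (loglik_m2 - loglik_m1))
    return stat, math.exp(-stat / 2.0)

# Hypothetical maximized log-likelihoods for one TE family.
stat, p_value = likelihood_ratio_test(-50.0, -47.0)
print(f"LRT statistic = {stat:.2f}, P = {p_value:.4f}")
```

With p fixed at 1 the mixture collapses to the single-coefficient model, which is what makes M1 a special case of M2.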
Results
We identified all unambiguous copies of these four elements in the high-recombination (HR) euchromatin of the sequenced D. melanogaster genome using BLAST (table 1 and table 2), designed PCR primers to the flanking regions of each copy (table 1), and used them to assess the frequency of individual elements in 18 natural strains of D. melanogaster collected in North America (California, USA) and Africa (Tunisia) (see Materials and Methods). We used 10 different isofemale, highly inbred strains from North America and 8 different isofemale strains from Tunisia. Because PCR failed in 11% of the cases, the number of tested strains was reduced from 18 to an average of 16 per TE insertion. The rate of PCR failure did not correlate significantly with TE length (Kendall's τ = […], P = […]) or vary significantly among TE families (G-test, 3 df; P = […]). In the course of the experiments we further verified the isogenicity of the strains by finding that 6 highly polymorphic TE insertions (varying from 42% to 75% in frequency across the strains) did not show a single case of polymorphism within the strains.
Furthermore, this Doc copy apparently truncates a conserved phosphotransferase-encoding gene (CG10618), suggesting that this insertion is likely to have a selective effect. The unusually high frequency of this Doc copy, the sharp variability of its frequency in different populations of D. melanogaster, and the reasonable expectation of the presence of a selective effect are all signs of the putative adaptive effect generated by the insertion of this Doc element (or of its tight linkage to an adaptive mutation in a neighboring locus). We are investigating these possibilities.
Note that the other TE families (Jockey, BS, and X) showed no sign ([…]) of heterogeneity of selection coefficients and that all other TEs were present at indistinguishable frequencies in the US and Tunisian populations, either for TEs within each family (results not shown) or for all TEs pooled together (tests of H0: Tunisian and American frequencies are drawn from the same distribution for all TEs; Wilcoxon signed-ranks test, P = […]; t-test, 67 df, P = […]; t-test after the angular (arcsine square-root) transformation of the data, 67 df, P = […]). Testing individual TE copies also revealed no instances of disparate frequencies in American and Tunisian populations. All TEs fixed in one population were fixed in the other, and the TEs present at intermediate frequencies in one population were also present at intermediate frequencies in the other population (G-test, 1 df; P values range from 0.77 to 1). Transposable elements that were not found in one population were either not found in the other population or present at very low frequencies (the maximum frequency was 2 out of 8, found for 3717BS in the Tunisian population; this frequency is not significantly different from the 0 out of 8 found for the same element in the American population; Fisher's exact test, P = […]). In the remainder of the analysis we will exclude the frequent Doc element and will discuss the data pooled across the Tunisian and American populations for the rest of the elements.
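The 2-out-of-8 versus 0-out-of-8 comparison for 3717BS can be worked through directly. The sketch below is a plain-Python two-sided Fisher's exact test for a 2x2 table; the function name is hypothetical, and whether the original analysis used a one-sided or two-sided test is not stated, so the two-sided form here is an assumption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1.0 + 1e-9))

# Rows: Tunisian and American strains; columns: element present, absent.
p = fisher_exact_two_sided(2, 6, 0, 8)
print(f"two-sided P = {p:.3f}")  # 56/120, about 0.467
```

With only 8 strains per population, a difference of this size is well within sampling noise, which is in line with the statement that the two frequencies do not differ significantly.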
Frequency Spectra Vary Sharply Across TE Families
The frequency distribution for each family is shown in figure 1. Different element families clearly exhibit distinct frequency distributions (Kruskal-Wallis rank test, both for all TEs and for the subset of polymorphic TEs only). There appear to be two distinct kinds of frequency distributions. One is exemplified by Jockey and Doc, in which all polymorphic copies are present at low frequencies. In contrast, BS and X elements are found at all frequencies: low, intermediate, and high. Whereas we cannot distinguish the frequency distribution of Jockey from that of Doc (Mann-Whitney rank test, P = […]; t-test, 49 df, P = […]) or that of BS from that of X (Mann-Whitney rank test, P = […]; t-test, 9 df, P = […]), the combined distribution of Jockey and Doc elements is sharply different from the combined distribution of BS and X elements (Mann-Whitney rank test, P = […]).
Estimating the Intensity of Selection
To understand the nature of population forces acting on these elements, we conducted maximum likelihood analysis of the strength of natural selection consistent with the observed frequency distributions of polymorphic elements. We assumed transposition-selection balance and used a diffusion approximation to obtain the expected frequency spectrum of elements as a function of the strength of selection. We adjusted this distribution to account for the fact that, by studying only copies that were initially found in the sequenced genome, we effectively presampled elements in proportion to their population frequencies (see Materials and Methods).
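The presampling adjustment can be illustrated with a short sketch. If elements are ascertained in a single sequenced genome, an element at population frequency x is found with probability proportional to x, so the expected spectrum is reweighted by x and renormalized. The function name and the discretized neutral spectrum (density proportional to 1/x) are illustrative assumptions, not the spectrum fitted in this study.

```python
def ascertained_spectrum(freqs, density):
    """Reweight an expected frequency spectrum by the ascertainment probability.

    freqs: population frequencies (0 < x <= 1) at which the density is given.
    density: expected relative abundance of elements at each frequency.
    Returns the normalized distribution of frequencies among elements that
    were found in one sampled genome.
    """
    weights = [x * d for x, d in zip(freqs, density)]
    total = sum(weights)
    return [w / total for w in weights]

# Under neutrality the expected spectrum is proportional to 1/x, so the
# ascertained spectrum becomes flat across frequency classes.
freqs = [i / 10 for i in range(1, 10)]
neutral = [1.0 / x for x in freqs]
flat = ascertained_spectrum(freqs, neutral)
print([round(v, 3) for v in flat])
```

The practical consequence is that, among presampled elements, an excess of low-frequency copies relative to a flat distribution points toward purifying selection.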
Using this probability distribution, we find evidence for strong ([…]) purifying selection acting on Doc (95% confidence interval: […]) and Jockey elements ([…]). Such strong purifying selection is entirely consistent with previous studies of TEs in Drosophila euchromatin (Charlesworth and Langley 1989). Surprisingly, the frequency distributions of BS ([…]) and X elements show no signs of purifying natural selection ([…]). Whereas we can easily reject neutrality ([…]) for Doc and Jockey (log-likelihood ratio of the maximum likelihood versus the likelihood value found by setting […], […] in both cases), we cannot do so for BS (log-likelihood test, P = […]) or X (log-likelihood test, P = […]).
Fixation of TEs in the Drosophila Euchromatin
We determined that several of the elements were present in all of the tested strains (2 X, 3 BS, and 1 Jockey). We verified this observation by diagnostically sequencing each of these six elements in several strains (minimum 2 and maximum 17 strains). In each case we confirmed the presence of the identified TE copy in all tested strains, with the sequence of the junctions >99% similar to that found in the sequenced Drosophila genome (data not shown). The comparison of the sequences of these TEs (taken from the D. melanogaster genome sequence database) with the full-length consensus sequences of these elements (table 1) is consistent with fixation of these elements in D. melanogaster. These 6 elements are significantly more divergent from their respective consensus sequences than the elements determined to be polymorphic (Mann-Whitney test, P = […]). Based on the level of divergence and using a rate of neutral evolution of 2% per Myr (Moriyama and Powell 1997), we can estimate that the three BS elements fixed fairly recently (0.4 to 3 MYA). The two fixed X elements and one Jockey element appear much more ancient (3.5 to 10 Myr), with multiple point substitutions riddling their sequence (Jockey element, 21% divergence from the consensus Jockey sequence; X elements, 7% to 24% divergence from the consensus X sequence).
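The conversion from divergence to age can be sketched as a simple division of the divergence from the consensus by the neutral rate. This is an uncorrected linear estimate under the assumption that all divergence accumulated after insertion at a constant 2% per Myr; the function name is hypothetical, and the authors' exact procedure (for example, any correction for multiple substitutions) may differ, so the output is only a rough guide.

```python
def age_in_myr(divergence_percent, rate_percent_per_myr=2.0):
    """Rough age of an insertion from its divergence from the family consensus.

    Assumes divergence accumulates linearly at the neutral rate, with no
    correction for multiple substitutions at the same site.
    """
    return divergence_percent / rate_percent_per_myr

# Divergence values quoted in the text for the least divergent fixed X
# element and for the fixed Jockey element.
for name, divergence in [("X (least divergent)", 7.0), ("Jockey", 21.0)]:
    print(f"{name}: about {age_in_myr(divergence):.1f} Myr")
```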
Discussion
Which model or models can account for the patterns that we see in these four non-LTR families? First of all, the TE-product expression model is not applicable in this case. Although it may in fact be very important for other types of TEs, it cannot explain strong selection acting against the Doc and Jockey elements because most of them are 5' truncated, promoter-less, and thus likely untranscribed and untranslated copies (12 out of 15 Doc and 32 out of 37 Jockey elements). Excluding the full-length elements does not affect any of the results (data not shown).
Rejecting the "Gene-Disruption" Model
In contrast, the model of selection against gene disruption should apply to these non-LTR families to the same extent as it would to any other dispersed TE family. Non-LTR families in general, and Jockey and Doc elements in particular, are known to induce visible mutations (Driver et al. 1989; O'Hare et al. 1991; White and Jacobson 1996), and so we know that they can disrupt genes. Can the "gene-disruption" model explain our results?
The data for the Jockey and Doc families are consistent with this model, assuming that there are no or extremely few truly neutral sites in the Drosophila euchromatin into which these elements can insert (Charlesworth 1991). This follows from the fact that all Jockey and Doc copies are rare (except a single element apparently affected by positive selection), and thus they must all affect neighboring genes in a subtle but decidedly ([…]) deleterious manner. However, the observation that some non-LTR families such as BS and X have many frequent elements and overall are distributed in a neutral (or nearly neutral) manner undermines this interpretation. Indeed, under the gene-disruption model we have to infer that there are plenty of truly neutral sites into which BS and X elements can insert and also that these elements avoid all of the deleterious sites into which Jockey and Doc elements inevitably insert.
To explain these contrasting patterns across different non-LTR families under the gene-disruption model, we need to find a reason why individual Doc and Jockey elements are significantly more deleterious than BS and X elements. On the face of it, this difference in deleterious effect does not appear very likely: the four families are very similar in sequence organization, as they all belong to the same Jockey family of non-LTR elements (Malik, Burke, and Eickbush 1999). They are also similar in the mode of transposition and have similar lengths of functional elements. Nevertheless, it is possible that Doc and Jockey have a different and particularly deleterious insertion site preference, for instance exclusively inside or very near genes, while BS and X elements have a different, nondeleterious insertion site preference, for instance always far away from genes. We fail, however, to find any evidence that this is the case. Jockey and Doc copies in our sample are not on average closer to genes than BS and X elements (t-test, P = […], 1-tailed test, 65 df; Mann-Whitney test, P = […]) (fig. 2). Similarly, there is no correlation between the population frequency of individual TEs and their distance to genes (Kendall's τ = […], P = […]).
To explain this based on the gene-disruption model, short Jockey elements containing a 3' portion of the reverse transcriptase coding sequence, the poly-A signal, and the poly-A tail must be inevitably deleterious even at large distances from genes, whereas the short BS and X elements containing very similar sequences must generally be entirely neutral. Although we cannot formally eliminate this possibility, we consider it highly unlikely. We thus provisionally reject the gene-disruption model for these four TE families.
The Ectopic-Recombination Model Is Consistent with the Data
The final described model is the "ectopic recombination model," whereby selection acts against deleterious effects of recombination among dispersed homologous TE copies. Can this model explain our data? Because TE copies of any particular family can recombine only with other copies from the same family, selection acting on any one family of TEs is entirely independent of selection on any other family. At the same time, copies within a given family should be subject to correlated levels of selection. In this way the model predicts that the strength of selection can vary systematically among different TE families. Moreover, in selection-transposition balance, the families that transpose less frequently should equilibrate at lower copy numbers and should be subject to weaker selection. The frequency of ectopic recombination should be a monotonically increasing function of the copy number (Montgomery, Charlesworth, and Langley 1987) and the length of polymorphic elements (Dray and Gloor 1997). Thus selection strength could easily vary across families, with the families containing fewer and/or shorter polymorphic copies predicted to be under weaker selection.
Our results appear consistent with all of these predictions. The strength of selection varies strongly across the TE families, with Jockey and Doc families being under stronger selection than BS and X elements. With only four data points, we do not have the power to test whether BS and X have significantly lower copy numbers than Jockey and Doc, but there appears to be a tendency in this direction (table 2). We do have enough power, however, to test whether polymorphic Jockey and Doc elements are on average much longer than polymorphic BS and X elements, and, as mentioned, indeed they are (fig. 3). The average and median lengths of polymorphic BS and X elements ([…] bp and 356 bp, respectively) are much shorter than the average and median lengths of polymorphic Doc and Jockey elements ([…] bp and 1,538 bp, respectively). These differences are significant (Mann-Whitney test, P = […]).
Selection Discriminates Among TEs Based on Length Within Families
The ectopic recombination model explains well why selection acts family by family and why selection is stronger in the families with more numerous and longer copies. It makes additional predictions, however. In particular the length of TEs should matter not only among families but also within them. The longer TEs should be more deleterious than shorter TEs within families and should be present at lower population frequencies on average.
To test this prediction we assessed whether the length of TEs within a given family correlates negatively with the population frequency. Because there was only a single Doc element at a non-zero frequency within the sampled strains (table 1), we excluded Doc elements from this analysis. In all other cases we did find a negative correlation between the length of TEs within a family and the population frequency (X, Kendall's τ = […], 1-tailed P = […]; BS, Kendall's τ = […], 1-tailed P = […]; Jockey, Kendall's τ = […], 1-tailed P = […]). Note that we are using the current length of polymorphic TEs and our best estimate of the length at insertion for fixed TEs. This is because we are attempting to understand the parameters of the population process of frequency change prior to fixation, and most secondary deletions are likely to have happened after fixation. This analysis is conservative for our purposes.
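The within-family test described above is a rank correlation between element length and population frequency. The sketch below computes Kendall's tau-b (the tie-corrected form) in plain Python; the function name is hypothetical and the example lengths and frequencies are made-up values for illustration, not data from table 1. Whether the original analysis used tau-a or tau-b is not stated.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation, which corrects for ties.

    Counts concordant and discordant pairs; pairs tied in only one variable
    enter the denominator, and pairs tied in both are ignored.
    """
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif (dx > 0) == (dy > 0):
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom

# Hypothetical family: longer elements sit at lower population frequencies.
lengths_bp = [150, 400, 900, 2500, 4700]
frequencies = [1.0, 0.5, 0.5, 0.06, 0.0]
print(f"tau-b = {kendall_tau_b(lengths_bp, frequencies):.3f}")
```

A strongly negative tau, as in this toy example, is the pattern the ectopic recombination model predicts within a family.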
If we limit the analysis to the polymorphic elements only, we still detect a negative correlation between the current length and the population frequency for the Jockey elements (Jockey, Kendall's τ = […], 1-tailed P = […]) and for the BS and X elements, albeit at a marginally significant level (X, Kendall's τ = […], 1-tailed P = […]; BS, Kendall's τ = […], 1-tailed P = […]). As expected, the fixed elements were significantly shorter at the time of integration than the polymorphic elements are at present within the Jockey (Wilcoxon rank test, 1-tailed P = […]) and BS families (Wilcoxon rank test, 1-tailed P = […]). The fixed X elements are on average shorter than the polymorphic ones, but this difference is not significant (Wilcoxon rank test, 1-tailed P = […]). However, the small number of X elements (5) makes meaningful comparisons difficult.
Interestingly, if instead of the current length of the polymorphic TEs, we look at the inferred length of polymorphic TEs at the time of their insertion, we no longer find a negative correlation between the length of Jockey elements and their population frequency (Kendall's τ = […], P = […]). Closer inspection shows that out of 9 polymorphic Jockey elements at non-zero frequency in the studied strains, three were full-length elements at the time of insertion that subsequently suffered large secondary deletions (2,507 bp, 2,587 bp, and 4,891 bp). No Jockey element at zero frequency suffered such large deletions. The population frequency of these three Jockey elements is substantially higher than that of the rest of the Jockey elements (Mann-Whitney test, P = […]).
Two explanations for this pattern are possible. One is that the Jockey elements present at non-zero frequencies are somewhat older on average and therefore had more time to suffer deletions than the elements present at zero frequencies. However, the analysis of point substitutions does not lend much support to this proposition. Jockey elements found at zero frequency and those found at non-zero frequency are not significantly different in their divergence from the consensus sequence, measured as the number of nucleotide differences per nucleotide (G-test, 1 df, P = […]). The other possible explanation is that full-length elements cannot reach non-zero frequencies observable in our sample (>5%) unless they become substantially shorter through secondary deletions. In this way secondary deletions may lower the strength of purifying selection acting on the long elements. Further supporting this interpretation is the finding that the rate of large (>400 bp) deletions relative to the rate of nucleotide substitution is substantially higher among non-zero frequency Jockey elements than among the zero frequency ones (Fisher's exact test, 1 df, P = […]).
Ectopic Recombination as an Example of Selection Based on Homologous Interactions
The overall result of this study is to suggest that, at least in these four families of TEs, selection does not operate on the individual effects of TEs on the neighboring genes, but rather operates at the level of families of homologous TEs. Moreover, selection gets stronger with the increase of the copy number and length of individual TE copies. All of these features are consistent with selection acting against the products of ectopic recombination. However, they are also consistent with selection based on other homology-dependent interactions. The presence of homologous DNA and RNA sequences in the cell leads to a multitude of profound phenotypic effects affecting chromatin state and levels of gene expression (Henikoff and Dreesen 1989; Fanti et al. 1998; Ketting et al. 1999; Pal-Bhadra, Bhadra, and Birchler 1999, 2002; Sass and Henikoff 1999; Wu and Morris 1999; Aravin et al. 2001). Thus it is entirely possible that selection against many dispersed, long, homologous TE copies is mediated not (or not exclusively) by ectopic recombination, but through some other homology-dependent, epigenetic effect. Our analysis does not distinguish among these possibilities.
Transposition-Selection Balance with Variable Transposition Rates Between Families or Within Families over Time
The variation in the strength of selection among TE families in our study most likely reflects variation in the family-specific rate of transposition. Indeed, in transposition-selection balance the copy number equilibrates at a level where the rate of TE elimination by selection matches the rate of transposition. The stronger level of selection acting against TEs within a family thus implies a higher rate of transposition of TEs within that family. Thus the most straightforward way to interpret our data is to postulate a higher rate of transposition for Jockey and Doc elements than for BS and X elements.
It is possible that Jockey and Doc are just more active TEs than BS and X. However, it is also possible that transposition rate within a family varies through time and we have simply caught Jockey and Doc during their active phase, whereas we have caught BS and X during their slow phase. Under this hypothetical scenario, BS and X used to transpose at high rates (similar to those currently observed for Jockey and Doc), were present in high copy numbers, and were under strong selection based on ectopic recombination (or some other homology-dependent selective mechanism). We know that transposition rate can itself evolve, and the presence of polymorphic modifiers of transposition in Drosophila populations has been documented (Nuzhdin et al. 1998). It is possible that fixation of repressors of transposition for BS and X families led to a sharp reduction of transposition rates, leading to a reduction of the copy number of the BS and X elements as a result of drift and purifying selection. Eventually, the copy number became sufficiently low, and the strength of selection sufficiently weak, to allow the shorter elements to drift to higher frequencies. In this scenario the rate of transposition must have remained low for a long enough time (on the order of 4N_e generations) that many short elements were able to reach high population frequencies, and some even had enough time to reach fixation.
Some features of our data hint that this second model may be a reasonable possibility. For instance, if BS elements have always been transposing at low rates with many copies reaching fixation, we should find many fixed copies of different ages. In fact, we do find three fixed copies, but all of them are fairly young (inserted <3 MYA based on the level of divergence from the consensus sequence). One explanation is that the older copies have all been lost or have become unrecognizable through frequent deletions (Petrov, Lozovskaya, and Hartl 1996; Pritchard and Schaeffer 1997; Petrov et al. 1998; Petrov and Hartl 1998; Ramos-Onsins and Aguade 1998; Robin et al. 2000; Blumenstiel, Hartl, and Lozovsky 2002; Petrov 2002). It is also possible, however, that BS elements were transposing fast and thus were not fixing through drift prior to 3 MYA. We also found a single fixed Jockey element that was fixed approximately 10 MYA (based on its 21.4% divergence from the consensus sequence). If the rate of transposition and the strength of selection against Jockey elements were both as high 10 MYA as they are today, we need to postulate that this Jockey element was swept to fixation by positive selection based on its local effect. Without positive selection, the probability of fixation of a new mutation with N_e s of -26 (as estimated for Jockey) is astronomically small (7 × 10^-49). Its small size (152 bp) is simply a coincidence under the scenario of positive selection, but it would be naturally predicted under the scenario in which it drifted to fixation during a period of low transposition rates of Jockey elements.
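The order of magnitude of such a fixation probability can be checked with a short sketch based on Kimura's diffusion approximation for a new semidominant mutation (genotype fitnesses 1, 1 + s, 1 + 2s). The effective population sizes below are hypothetical, the census size is assumed equal to N_e, and the parameterization used to obtain the figure quoted above is not stated in this section, so the sketch is not expected to reproduce that figure exactly.

```python
import math


def fixation_probability(s, n_e):
    """Kimura's diffusion approximation for a single new mutation.

    Assumes a diploid population whose census and effective sizes are
    both n_e, an initial frequency of 1 / (2 * n_e), and semidominant
    selection with heterozygous coefficient s (negative = deleterious).
    """
    if s == 0:
        return 1.0 / (2.0 * n_e)  # neutral limit
    # expm1 keeps precision when s is very small in absolute value
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n_e * s)


N_E_S = -26.0  # scaled selection coefficient estimated for Jockey in the text

for n_e in (1e4, 1e5, 1e6):  # hypothetical effective population sizes
    s = N_E_S / n_e
    print(f"N_e = {n_e:.0e}: P(fixation) = {fixation_probability(s, n_e):.1e}")
```

Whatever value of N_e is assumed within this range, the probability stays far below the reciprocal of any plausible number of insertion events, which is the point of the argument.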
These considerations are clearly insufficient to distinguish between the model where transposition rate varies mostly among TE families and stays relatively constant within a family, and the model where transposition rate varies sharply for a particular family through time, with long periods (on the order of 4N_e generations) of high and low transposition rates. However, future studies could resolve this question. In particular, studies of the same families in multiple Drosophila species can establish whether the same families (such as Jockey and Doc) are always present in high copy numbers and low population frequencies and whether the reverse is true for other families (such as BS and X). Studies of the age distribution of fixed TEs for more TE families may also shed light on this issue.
Selection at the Level of Ectopic Recombination and Genome Evolution
If the neutral attainment of intermediate population frequency and even fixation is a consistent feature of some TE families or a periodic feature of many TE families, why haven't we seen more fixed and high-frequency TEs before? The probable explanation is that mostly short elements reach fixation. In view of the recent demonstration that a high rate of DNA loss through small (<400 bp) deletions affects most or all sequences in the Drosophila genome (Petrov, Lozovskaya, and Hartl 1996; Pritchard and Schaeffer 1997; Petrov et al. 1998; Petrov and Hartl 1998; Ramos-Onsins and Aguade 1998; Robin et al. 2000; Blumenstiel, Hartl, and Lozovsky 2002; Petrov 2002), it seems likely that fixed elements have a relatively short persistence time.
However, even the high-frequency polymorphic or recently fixed short TEs that have not yet been deleted may have been overlooked in the past. Most surveys of TE frequencies have relied either on in situ hybridization to polytene chromosomes or on surveys of particular genomic regions. In situ hybridization is quite inefficient when dealing with very short regions of homology, whereas population surveys of particular genomic regions bias the analysis in favor of high-copy polymorphic TEs, as these are more likely to be captured segregating in a predetermined region. As the families containing high-frequency polymorphic copies are likely to be present in low copy numbers, they will be underrepresented in population samples based on predefined chromosomal regions.
Despite the relative paucity of detectable fixed TEs, fixation of TEs may be of great importance in Drosophila genome size evolution. TE fixation, if it occurs from time to time for a large number of TE families, may provide the counterbalancing force to persistent DNA loss through frequent small deletions (Petrov, Lozovskaya, and Hartl 1996; Pritchard and Schaeffer 1997; Petrov et al. 1998; Petrov and Hartl 1998; Ramos-Onsins and Aguade 1998; Robin et al. 2000; Blumenstiel, Hartl, and Lozovsky 2002; Petrov 2002).
There also may be implications of these findings for the evolution of the transposable elements themselves. If the principal deleterious effect of TEs is due to ectopic recombination, multiple TEs should be able to coexist without burdening the host, provided they do not recombine with each other. The risk of ectopic recombination should therefore impose a strong selective pressure for rapid sequence divergence of TEs. This may be one reason for the evolution of such a large number of TE families in the Drosophila lineage (Charlesworth and Langley 1989).
Literature Cited |
---|
Adams, M. D., S. E. Celniker, and R. A. Holt, et al. (194 co-authors). 2000. The genome sequence of Drosophila melanogaster. Science 287:2185-2195.
Aravin, A. A., N. M. Naumova, A. V. Tulin, V. V. Vagin, Y. M. Rozovsky, and V. A. Gvozdev. 2001. Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in the D. melanogaster germline. Curr. Biol. 11:1017-1027.
Bartolome, C., X. Maside, and B. Charlesworth. 2002. On the abundance and distribution of transposable elements in the genome of Drosophila melanogaster. Mol. Biol. Evol. 19:926-937.
Becker, P. B. 1995. Drosophila chromatin and transcription. Semin. Cell Biol. 6:185-190.
Bender, W., P. Spierer, and D. S. Hogness. 1983. Chromosomal walking and jumping to isolate DNA from the Ace and rosy loci and the Bithorax complex in Drosophila melanogaster. J. Mol. Biol. 168:17-33.
Berezikov, E., A. Bucheton, and I. Busseau. 2000. A search for reverse transcriptase-coding sequences reveals new non-LTR retrotransposons in the genome of Drosophila melanogaster. Genome Biol. 1:research0012.1-0012.15.
Berg, D. E., and M. M. Howe. 1989. Mobile DNA. American Society for Microbiology, Washington, D.C.
Berry, A. J., J. W. Ajioka, and M. Kreitman. 1991. Lack of polymorphism on the Drosophila fourth chromosome resulting from selection. Genetics 129:1111-1117.
Birchler, J. A., M. P. Bhadra, and U. Bhadra. 2000. Making noise about silence: repression of repeated genes in animals. Curr. Opin. Genet. Dev. 10:211-216.
Blumenstiel, J. P., D. L. Hartl, and E. R. Lozovsky. 2002. Patterns of insertion and deletion in contrasting chromatin domains. Mol. Biol. Evol. 19:2211-2225.
Charlesworth, B. 1991. Transposable elements in natural populations with a mixture of selected and neutral insertion sites. Genet. Res. 57:127-134.
Charlesworth, B. 1996. Background selection and patterns of genetic diversity in Drosophila melanogaster. Genet. Res. 68:131-149.
Charlesworth, B., and D. Charlesworth. 1983. The population dynamics of transposable elements. Genet. Res. 42:1-27.
Charlesworth, B., and C. H. Langley. 1989. The population genetics of Drosophila transposable elements. Annu. Rev. Genet. 23:251-287.
Charlesworth, B., A. Lapid, and D. Canada. 1992a. The distribution of transposable elements within and between chromosomes in a population of Drosophila melanogaster. I. Element frequencies and distribution. Genet. Res. 60:103-114.
Charlesworth, B., A. Lapid, and D. Canada. 1992b. The distribution of transposable elements within and between chromosomes in a population of Drosophila melanogaster. II. Inferences on the nature of selection against elements. Genet. Res. 60:115-130.
Charlesworth, B., M. T. Morgan, and D. Charlesworth. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.
Charlesworth, B., P. Sniegowski, and W. Stephan. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215-220.
Craig, N. L., R. Craigie, M. Gellert, and A. M. Lambowitz. 2002. Mobile DNA II. American Society for Microbiology, Washington, D.C., p. 1204.
Dray, T., and G. B. Gloor. 1997. Homology requirements for targeting heterologous sequences during P-induced gap repair in Drosophila melanogaster. Genetics 147:689-699.
Driver, A., S. F. Lacey, T. E. Cullingford, A. Mitchelson, and K. O'Hare. 1989. Structural analysis of Doc transposable elements associated with mutations at the white and suppressor of forked loci of Drosophila melanogaster. Mol. Gen. Genet. 220:49-52.
Ewens, W. J. 1979. Mathematical population genetics. Springer-Verlag, New York.
Fanti, L., D. R. Dorer, M. Berloco, S. Henikoff, and S. Pimpinelli. 1998. Heterochromatin protein 1 binds transgene arrays. Chromosoma 107:286-292.
Finnegan, D. J. 1992. Transposable elements. Curr. Opin. Genet. Dev. 2:861-867.
Finnegan, D. J., and D. H. Fawcett. 1986. Transposable elements in Drosophila melanogaster. Pp. 1-62 in N. MacLean, ed. Oxford Surveys of Eukaryotic Genes. Oxford University Press, Oxford.
Geyer, P. K., C. Spana, and V. G. Corces. 1986. On the molecular mechanism of gypsy-induced mutations at the yellow locus of Drosophila melanogaster. EMBO J. 5:2657-2662.
Goldman, A. S., and M. Lichten. 1996. The efficiency of meiotic recombination between dispersed sequences in Saccharomyces cerevisiae depends upon their chromosomal location. Genetics 144:43-55.
Goldman, A. S., and M. Lichten. 2000. Restriction of ectopic recombination by interhomolog interactions during Saccharomyces cerevisiae meiosis. Proc. Natl. Acad. Sci. USA 97:9537-9542.
Harrison, D. A., P. K. Geyer, C. Spana, and V. G. Corces. 1989. The gypsy retrotransposon of Drosophila melanogaster: mechanisms of mutagenesis and interaction with the suppressor of hairy-wing locus. Dev. Genet. 10:239-248.
Henikoff, S. 1995. Gene silencing in Drosophila. Curr. Top. Microbiol. Immunol. 197:193-208.
Henikoff, S., and T. D. Dreesen. 1989. Trans-inactivation of the Drosophila brown gene: evidence for transcriptional repression and somatic pairing dependence. Proc. Natl. Acad. Sci. USA 86:6704-6708.
Hill, W. G., and A. Robertson. 1966. The effect of linkage on limits to artificial selection. Genet. Res. 8:269-294.
Hoogland, C., and C. Biemont. 1996. Chromosomal distribution of transposable elements in Drosophila melanogaster: test of the ectopic recombination model for maintenance of insertion site number. Genetics 144:197-204.
Hudson, R. R., and N. L. Kaplan. 1995. Deleterious background selection with recombination. Genetics 141:1605-1617.
Kaminker, J. S., C. M. Bergman, and B. Kronmiller, et al. (12 co-authors). 2002. The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective. Genome Biol. 3:research0084-4.
Kaplan, N. L., and J. F. Brookfield. 1983. Transposable elements in mendelian populations. III. Statistical results. Genetics 104:485-495.
Ketting, R. F., T. H. Haverkamp, H. G. van Luenen, and R. H. Plasterk. 1999. Mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog of Werner syndrome helicase and RNaseD. Cell 99:133-141.
Kidwell, M. G. 1989. Regulatory aspects of the expression of P-M hybrid dysgenesis in Drosophila. Pp. 183-194 in M. E. Lambert, J. F. McDonald, I. B. Weinstein, eds. Transposable elements as mutagenic agents. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Kidwell, M. G. 2002. Transposable elements and the evolution of genome size in eukaryotes. Genetica 115:49-63.
Kreitman, M. 1983. Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304:412-417.
Langley, C. H., J. F. Brookfield, and N. L. Kaplan. 1983. Transposable elements in mendelian populations. I. A theory. Genetics 104:457-471.
Langley, C. H., E. Montgomery, R. Hudson, N. Kaplan, and B. Charlesworth. 1988. On the role of unequal exchange in the containment of transposable element copy number. Genet. Res. 52:223-235.
Laski, F. A., D. C. Rio, and G. M. Rubin. 1986. Tissue specificity of Drosophila P element transposition is regulated at the level of mRNA splicing. Cell 44:7-19.
Lohe, A. R., and D. L. Hartl. 1996. Autoregulation of mariner transposase activity by overproduction and dominant-negative complementation. Mol. Biol. Evol. 13:549-555.
Lozovskaya, E. R., D. L. Hartl, and D. A. Petrov. 1995. Genomic regulation of transposable elements in Drosophila. Curr. Opin. Genet. Dev. 5:768-773.
Lu, B. Y., J. Ma, and J. C. Eissenberg. 1998. Developmental regulation of heterochromatin-mediated gene silencing in Drosophila. Development 125:2223-2234.
Luan, D. D., M. H. Korman, J. L. Jacubczak, and T. H. Eickbush. 1993. Reverse transcriptase of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell 72:595-605.
Malik, H. S., W. D. Burke, and T. H. Eickbush. 1999. The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 16:793-805.
McDonald, J. F., L. V. Matyunina, S. Wilson, I. K. Jordan, N. J. Bowen, and W. J. Miller. 1997. LTR retrotransposons and the evolution of eukaryotic enhancers. Genetica 100:3-13.
Mizrokhi, L. I., L. A. Obolenkova, A. F. Priimagi, Y. V. Ilyin, T. I. Gerasimova, and G. P. Georgiev. 1985. The nature of unstable insertion mutations and reversions in the locus cut of Drosophila melanogaster: molecular mechanism of transposition. EMBO J. 4:3781-3787.
Montgomery, E., B. Charlesworth, and C. H. Langley. 1987. A test for the role of natural selection in the stabilization of transposable element copy number in a population of Drosophila melanogaster. Genet. Res. 49:31-41.
Montgomery, E. A., S. M. Huang, C. H. Langley, and B. H. Judd. 1991. Chromosome rearrangement by ectopic recombination in Drosophila melanogaster: genome structure and evolution. Genetics 129:1085-1098.
Moriyama, E. N., and J. R. Powell. 1997. Synonymous substitution rates in Drosophila: mitochondrial versus nuclear genes. J. Mol. Evol. 45:378-391.
Nagylaki, T. 1974. The moments of stochastic integrals and the distribution of sojourn times. Proc. Natl. Acad. Sci. USA 71:746-749.
Nuzhdin, S. V. 1999. Sure facts, speculations, and open questions about the evolution of transposable element copy number. Genetica 107:129-137.
Nuzhdin, S. V., E. G. Pasyukova, E. A. Morozova, and A. J. Flavell. 1998. Quantitative genetic analysis of copia retrotransposon activity in inbred Drosophila melanogaster lines. Genetics 150:755-766.
O'Hare, K., M. R. Alley, T. E. Cullingford, A. Driver, and M. J. Sanderson. 1991. DNA sequence of the Doc retroposon in the white-one mutant of Drosophila melanogaster and of secondary insertions in the phenotypically altered derivatives white-honey and white-eosin. Mol. Gen. Genet. 225:17-24.
Pal-Bhadra, M., U. Bhadra, and J. A. Birchler. 1999. Cosuppression of nonhomologous transgenes in Drosophila involves mutually related endogenous sequences. Cell 99:35-46.
Pal-Bhadra, M., U. Bhadra, and J. A. Birchler. 2002. RNAi related mechanisms affect both transcriptional and posttranscriptional transgene silencing in Drosophila. Mol. Cell 9:315-327.
Petrov, D. A. 2002. DNA loss and evolution of genome size in Drosophila. Genetica 115:81-91.
Petrov, D. A., Y.-C. Chao, E. C. Stephenson, and D. L. Hartl. 1998. Pseudogene evolution in Drosophila suggests a high rate of DNA loss. Mol. Biol. Evol. 15:1562-1567.
Petrov, D. A., and D. L. Hartl. 1998. High rate of DNA loss in the Drosophila melanogaster and Drosophila virilis species groups. Mol. Biol. Evol. 15:293-302.
Petrov, D. A., E. R. Lozovskaya, and D. L. Hartl. 1996. High intrinsic rate of DNA loss in Drosophila. Nature 384:346-349.
Petrov, D. A., J. L. Schutzman, D. L. Hartl, and E. R. Lozovskaya. 1995. Diverse transposable elements are mobilized in hybrid dysgenesis in Drosophila virilis. Proc. Natl. Acad. Sci. USA 92:8050-8054.
Pritchard, J. K., and S. W. Schaeffer. 1997. Polymorphism and divergence at a Drosophila pseudogene locus. Genetics 147:199-208.
Ramos-Onsins, S., and M. Aguade. 1998. Molecular evolution of the Cecropin multigene family in Drosophila: functional genes vs. pseudogenes. Genetics 150:157-171.
Robert, V., N. Prud'homme, A. Kim, A. Bucheton, and A. Pelisson. 2001. Characterization of the flamenco region of the Drosophila melanogaster genome. Genetics 158:701-713.
Robin, G. C., R. J. Russell, D. J. Cutler, and J. G. Oakeshott. 2000. The evolution of an α-esterase pseudogene inactivated in the Drosophila melanogaster lineage. Mol. Biol. Evol. 17:563-575.
Sass, G. L., and S. Henikoff. 1999. Pairing-dependent mislocalization of a Drosophila brown gene reporter to a heterochromatic environment. Genetics 152:595-604.
Schug, M. D., C. M. Hutter, K. A. Wetterstrand, M. S. Gaudette, T. F. Mackay, and C. F. Aquadro. 1998. The mutation rates of di-, tri- and tetranucleotide repeats in Drosophila melanogaster. Mol. Biol. Evol. 15:1751-1760.
Scott, K. C., A. D. Taubman, and P. K. Geyer. 1999. Enhancer blocking by the Drosophila gypsy insulator depends upon insulator anatomy and enhancer strength. Genetics 153:787-798.
Smith, J. M., and J. Haigh. 1974. The hitch-hiking effect of a favourable gene. Genet. Res. 23:23-35.
Smith, P. A., and V. G. Corces. 1992. The suppressor of Hairy-wing binding region is required for gypsy mutagenesis. Mol. Gen. Genet. 233:65-70.
Tudor, M., A. J. Davis, M. Feldman, M. Grammatikaki, and K. O'Hare. 2001. The X element, a novel LINE transposable element from Drosophila melanogaster. Mol. Genet. Genomics 265:489-496.
Udomkit, A., S. Forbes, G. Dalgleish, and D. J. Finnegan. 1995. BS: a novel LINE-like element in Drosophila melanogaster. Nucleic Acids Res. 23:1354-1358.
White, L. D., and J. W. Jacobson. 1996. Insertion of the retroposable element, Jockey, near the Adh gene of Drosophila melanogaster is associated with altered gene expression. Genet. Res. 68:203-209.
Wu, C. T., and J. R. Morris. 1999. Transvection and other homology effects. Curr. Opin. Genet. Dev. 9:237-246.