Horizontal Transfer and Selection in the Evolution of P Elements

Joana C. Silva3,* and Margaret G. Kidwell*,†

*Interdisciplinary Program in Genetics and
†Department of Ecology and Evolutionary Biology, University of Arizona


    Abstract
 
The roles of selection and horizontal transfer in the evolution of the canonical subfamily of P elements were studied in the saltans and willistoni species groups of the genus Drosophila (subgenus Sophophora). We estimate that the common ancestor of the canonical P subfamily dates back 2–3 Myr at the most, despite the much older age (more than 40 Myr) of the P family as a whole. The evolution of the canonical P subfamily is characterized by weak selection at nonsynonymous sites. These sites have evolved at three quarters the rate of synonymous sites, in which no selective constraints were detected. Their recent horizontal transfer best explains the high degree of similarity among canonical P elements from the saltans and willistoni species groups. These results are consistent with a model of P-element evolution in which selective constraints are imposed at the time of horizontal transfer. Furthermore, it is estimated that the spread and diversification of the canonical subfamily involved a minimum of 11 horizontal transfer events among the 18 species surveyed within the past 3 Myr. The presence of multiple P subfamilies in the saltans and willistoni species groups is likely to be the result of multiple invasions that have previously swept through these taxa in a succession of horizontal transfer events. These results suggest that horizontal transfer among eukaryotes might be more common than anticipated.


    Introduction
 
Transposable elements (TEs) are DNA segments that possess the ability to move, or transpose, to new sites within the genome of a host species (Berg and Howe 1989). There is an enormous variety of TEs, and they have been grouped into families according to structure and degree of sequence similarity. These families have been divided into two major classes on the basis of their transposition mechanisms (Finnegan 1992): during transposition of class I elements, a new DNA copy is made by reverse transcription of an RNA intermediate; class II elements transpose directly from DNA to DNA. TEs are present in all species in which they have been sought and often comprise a large proportion of the host's genome, in some instances reaching over 50% (SanMiguel et al. 1996).

Despite their pervasiveness, many unanswered questions regarding TE evolution remain. Two central questions address the relative contributions of vertical and horizontal transmission to the taxonomic distribution of TE families and the role that selection plays in the long-term maintenance of TE functionality. Some TE families have widespread distributions, which often extend across genus, family, and even phylum boundaries (e.g., Stacey et al. 1986; Daniels et al. 1990; Montchamp-Moreau et al. 1993; Robertson et al. 1998). The distribution of a TE family in multiple taxa can result from vertical transmission of a TE present in their common ancestor or from the TE's spread among those taxa through horizontal transfer only. These are the two opposite extremes in a continuum of scenarios that differ with regard to the relative contributions of horizontal and vertical transmission. It has been argued that horizontal transfer occurs rarely, and that the transmission of TEs is predominantly vertical (Capy et al. 1997, p. 3). However, horizontal transfer has been proposed to be an integral part of the life cycle of some eukaryotic class II TE families, such as the mariner-like and P families (Clark et al. 1995; Robertson and Lampe 1995). Recently, class I elements have been unquestionably shown to transfer horizontally as well (Jordan, Matyunina, and McDonald 1999), an event with potentially wide-ranging implications (Flavell 1999). Presently, it is unclear how frequent a process horizontal transfer among eukaryotes really is.

Horizontal transfer has traditionally been inferred when the high degree of similarity between TE sequences is difficult to reconcile with the long divergence time of their respective host species (e.g., Daniels et al. 1990). Other observations can help corroborate that inference, such as the incongruence between TE and host phylogenies or the absence of the TE in question from taxa closely related to that into which the TE was supposedly transferred horizontally (e.g., Clark et al. 1995). The identification of horizontal transfer events is not straightforward, as alternative explanations are often hard to dismiss conclusively, especially when TEs from closely related taxa are compared. Such explanations include TE ancestral polymorphism coupled with independent assortment of copies into the descendant species, inequality of substitution rates in TE sequences in different species, and the stochastic loss of TEs from a few taxa (Capy et al. 1997, pp. 130–135).

There is a situation, however, in which the inference of horizontal transfer events is less problematic. By comparing the divergence of TE nucleotide sequences with that observed for host genes evolving under similar or stronger selective constraints, a horizontal transfer event can be inferred whenever the divergence among TE sequences is significantly lower than that observed for the host genes. This procedure requires a careful assessment of the selective constraints on the TEs under study, as those constraints can vary by over an order of magnitude between and within TE families (e.g., Eickbush et al. 1995; Robertson and Lampe 1995; McAllister and Warren 1997; Witherspoon et al. 1997). Here, we used this method to estimate the frequency of horizontal transfer events in the canonical subfamily of the P elements.

The P family belongs to class II, as its elements transpose directly from DNA to DNA. A complete canonical P element was first isolated from Drosophila melanogaster. This 2,907-bp element has four open reading frames (ORFs), which encode a sequence-specific DNA-binding transposase that catalyzes transposition of the element (O'Hare and Rubin 1983). The first three ORFs are also part of a truncated transcript that encodes a transposition repressor (Misra and Rio 1990).

The taxonomic distribution of the P family of elements is patchy and relatively restricted. These elements are most prevalent in the subgenus Sophophora of the genus Drosophila (Daniels et al. 1990). The subgenus Sophophora consists of four major species groups: the D. melanogaster and Drosophila obscura groups, which diverged in the Old World, and the Drosophila willistoni and Drosophila saltans groups, which are restricted to the New World (Throckmorton 1975). The 200 P-element sequences obtained to date have been organized into 16 subfamilies (Hagemann, Miller, and Pinsker 1994; Clark and Kidwell 1997). Each subfamily corresponds to a monophyletic group of elements obtained from species of the same species group, with the exception of monophyletic elements from the closely related saltans and willistoni species groups, which are grouped into the same subfamily (Clark and Kidwell 1997).

With one notable exception, the canonical subfamily of P elements appears to be restricted to the sophophoran New World species groups, saltans and willistoni, two sister taxa whose common ancestor dates back at least 15 Myr (Daniels et al. 1990; Clark and Kidwell 1997). The exceptional canonical elements found in D. melanogaster have been explained by a recent horizontal transfer event (Daniels et al. 1990). The canonical subfamily is composed of closely related elements with a range of divergence between 0% and 10% at the sequence level (Clark et al. 1995). The reason for this low divergence remains unclear. It might be the result of strong selective constraints, or it could be explained by multiple horizontal transfer events occurring both within and between the two species groups.

In the present study, we assess the roles played by selection and horizontal transfer in determining the similarity among canonical P elements from the saltans and willistoni species groups. The ratio of nonsynonymous to synonymous substitutions, as well as the amount of codon bias, in P elements allows us to determine the strength of selective constraints acting on the canonical elements. We compare the divergence of the P elements with that of host genes evolving under similar or stronger constraints than the P elements to determine the number of horizontal transfer events that are necessary to explain the current taxonomic distribution of canonical P elements. The relevance of selection and horizontal transfer in the history of the P canonical subfamily is discussed in the context of the evolution and long-term survival of the P family of TEs.


    Materials and Methods
 
DNA Sequencing
Genomic DNAs from Scaptomyza pallida and Scaptomyza elmoi were obtained from P. M. O'Grady and J. B. Clark, respectively. The second exon of the Adh gene was PCR-amplified under standard conditions; the primers map to the 3' end of exon 1 and the 5' end of exon 3 (O'Grady, Clark, and Kidwell 1998). The product was cloned into a TA cloning vector (Invitrogen), and both strands of a single clone were sequenced for each species, using an ABI 377 automated sequencer.

Sources of Data Sets
Four data sets of DNA sequences were used for this study, a P-element data set and three others which correspond to host nuclear loci: alcohol dehydrogenase (Adh), period (per), and Cu/Zn superoxide dismutase (Sod). With the exception of the Adh sequences for Scaptomyza, these data were all gathered from the literature and represent all sequence information available in GenBank for at least one species in each of the two species groups.

The P-element data set consisted of 52 P-element partial sequences obtained from the literature (see table 1). Four were sampled from species of the genus Scaptomyza: Spallida18 and Spallida02 (Simonelig and Anxolabéhère 1991) and Selmoi4 and Selmoi12 (unpublished data). The remaining 48 elements corresponded to the canonical subfamily described by Clark et al. (1995), except that we excluded Daustrosaltans22 (identical to Daustrosal21), Dlusaltans37 (identical to Dlusal34), and Dsturtevanti24, Dsturtevanti25, and Dsturtevanti26 (identical to Dsturt13). The sequence DnebulosaN10 was obtained from Lansman et al. (1987). The sequences were 429 nt long and mapped to positions 1328–1757 in ORF2 of the canonical P element (O'Hare and Rubin 1983).


 
Table 1 Taxa Used in the Study

 
The Adh data set had 27 partial sequences of 405 nt each, which corresponded to the complete second exon of the D. melanogaster Adh gene (table 1). Sequences for the saltans and willistoni species groups were obtained from O'Grady (1998) and O'Grady, Clark, and Kidwell (1998), with the exception of the D. willistoni sequence (Anderson, Carew, and Powell 1993). Adh sequences for S. pallida and S. elmoi were obtained for this paper.

The per data set had 39 sequences (table 1), of which 38 belonged to the willistoni species group (Gleason and Powell 1997) and only one belonged to the saltans species group (Peixoto et al. 1993). The willistoni group sequences consisted of a 1,231-nt fragment mapping to amino acids 518–919 in exon 5 (Citri et al. 1987). The D. saltans sequence consisted of a 198-nt fragment mapping to amino acids 582–649, with a 6-bp gap corresponding to amino acids 618–619.

The Sod data set consisted of two sequences, one from D. willistoni and one from D. saltans (Kwiatowski et al. 1994a, 1994b). The D. willistoni sequence was 879 nt long; it consisted of a 417-nt intron and two exons of 66 and 396 nt in length. The D. saltans sequence had a shorter intron (352 nt), and the first 23 nt of the first exon were missing.

The Adh data set was the most complete, with sequences for all species listed in table 1. It was therefore used to estimate the species phylogeny and its branch lengths. All three host genes were used to estimate DNA sequence divergence between species, species subgroups, and/or species groups. The per data were unbalanced (with 38 sequences available for the willistoni group and only 1 for the saltans group), and the Sod data set was much smaller than the other two, with only one sequence available for each species group. This is reflected in the standard deviation associated with the mean number of substitutions, which takes into account sequence length, number of sequences, and phylogeny (Nei and Jin 1989).

Sequence Analyses
Alignment of P-element, Adh, and Sod sequences was done by eye and was straightforward. Sixteen alignment gaps were required in the P-element data set to compensate for the presence of indels. The per sequences were aligned as in Gleason and Powell (1997).

Estimation of dN and dS
The numbers of synonymous substitutions per synonymous site, dS, and nonsynonymous substitutions per nonsynonymous site, dN, were estimated using the method of Nei and Gojobori (1986). Standard deviations for the average dS and dN between groups were calculated as in Nei and Jin (1989) using the program dNdSwq (obtained from Jack da Silva). In order to preserve the reading frame of the functional canonical P element, the alignment gaps introduced in the P-element data set that represented insertions relative to the canonical P element were deleted prior to estimation of dS and dN. The three termination codons found (in sequences Dsalt51, DwilliS1, and Dsuci21) were recoded as missing data. The overall dN : dS ratio was also determined using a phylogenetics-oriented approach on a subset of the most likely topologies for the canonical P-element sequences (see below). The likelihood of each topology given the data was determined under two models, one in which dN/dS was free to vary, and a submodel in which dN/dS was set to unity. A significant difference in likelihood under the two models, identified with a likelihood ratio test, indicates that a dN : dS ratio different from 1 provides a significantly better fit to the data. This analysis was done using a codon-based model of substitution (Yang, Goldman, and Friday 1994) implemented in PAML (Yang 1998). Because of the inability of the program to handle deletions, two sets of analyses were performed. In one, the 14 P sequences with deletions were eliminated from the data set. In the other, codon positions containing indels were eliminated.
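The likelihood ratio test described above can be illustrated with a minimal sketch. It assumes that the two models are nested and differ by one free parameter (dN/dS), so that twice the difference in log-likelihood is compared with a chi-square distribution with 1 degree of freedom. The function name and the log-likelihood values are hypothetical and chosen only for illustration; they are not values from this study.

```python
import math


def lrt_one_df(lnl_free: float, lnl_fixed: float) -> tuple[float, float]:
    """Likelihood ratio test for two nested models differing by one parameter.

    lnl_free  : log-likelihood with dN/dS estimated from the data
    lnl_fixed : log-likelihood with dN/dS fixed at 1
    Returns the test statistic and its P value (chi-square, 1 df).
    """
    stat = max(2.0 * (lnl_free - lnl_fixed), 0.0)
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods (assumed for illustration only).
stat, p = lrt_one_df(lnl_free=-2910.0, lnl_fixed=-2911.5)
print(f"2*delta(lnL) = {stat:.2f}, P = {p:.3f}")
```

A statistic above about 3.84 would reject dN/dS = 1 at the 5% level under these assumptions.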

Estimation of Codon Bias and G+C in Synonymous Sites
Two indices of codon bias were determined for each sequence: the effective number of codons (Nc; Wright 1990) and the codon bias index (CBI; Morton 1993). Nc varies between 20 for maximum codon bias (when only one codon is used per amino acid) and 61 for minimum codon bias (synonymous codons for each amino acid used at similar frequencies). A CBI value of 0 corresponds to no bias, and a value of 1 corresponds to maximum bias.

The frequency of G and C nucleotides in third positions in synonymous codons was determined as the proportion of C- and G-ending codons, with the exclusion of termination, methionine, and tryptophan codons.
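A minimal sketch of this G+C calculation follows. It assumes an in-frame coding sequence given as a plain string and skips codons containing ambiguous characters or gaps; the function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Methionine (ATG) and tryptophan (TGG) have no synonymous alternatives.
EXCLUDED = STOP_CODONS | {"ATG", "TGG"}


def gc3_fraction(coding_seq: str) -> float:
    """Proportion of G- or C-ending codons among informative codons."""
    seq = coding_seq.upper().replace("U", "T")
    counted = 0
    gc_ending = 0
    # Walk the sequence codon by codon, ignoring a trailing partial codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon in EXCLUDED or any(base not in "ACGT" for base in codon):
            continue
        counted += 1
        if codon[2] in "GC":
            gc_ending += 1
    if counted == 0:
        raise ValueError("no informative codons in sequence")
    return gc_ending / counted


# Hypothetical example: ATG GCC AAA TTT GGC TGG TAA
example = "ATGGCCAAATTTGGCTGGTAA"
print(gc3_fraction(example))
```

In the example, ATG, TGG, and TAA are excluded, leaving four codons of which two (GCC and GGC) end in G or C.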

Inferring Horizontal Transfer
Horizontal transfer can be broadly defined as the introduction of genetic material into a species from which it had been previously absent. Potential mechanisms for horizontal transfer include vector-mediated transmission and introgression (Capy et al. 1997, pp. 141–143). In this study, horizontal transfer was investigated by comparing TE divergence with that of host genes across each node of the host phylogeny. A horizontal transfer event can be inferred when the estimated TE divergence is significantly lower than that of host genes under similar or higher levels of selective constraints than those operating on the TEs themselves.

DNA sequences from three host loci were used in this study: alcohol dehydrogenase (Adh), period (per), and Cu/Zn superoxide dismutase (Sod). The ADH enzyme is one of the most abundant proteins in Drosophila, in which it is used by both larvae and adults primarily to catabolize ethanol present in the fermenting fruits in which they develop and/or feed (Sullivan, Atkinson, and Starmer 1990). per is a pleiotropic gene that plays a major role in essential functions such as male courtship behavior and circadian rhythm (Konopka and Benzer 1971; Kyriacou and Hall 1989). Finally, Sod encodes an antioxidant enzyme used in the pathway that converts oxygen radicals into hydrogen peroxide and water (Lindsley and Zimm 1992 and references therein). Given the functional significance of the three genes, selective constraints are expected to be high. Conservatively, horizontal transfer was inferred only when the divergence among P sequences was lower than that observed for all host genes.
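The conservative criterion can be sketched as follows. The sketch assumes a one-sided normal approximation (z test) on means and standard errors, which is an illustrative choice and not necessarily the test used in this study; the function names and the critical value are likewise assumptions. The input values are the between-group dS and dN estimates reported in Results and Discussion.

```python
import math


def z_lower(te_mean: float, te_se: float, host_mean: float, host_se: float) -> float:
    """z statistic for the hypothesis that TE divergence is lower than host divergence."""
    return (host_mean - te_mean) / math.sqrt(te_se ** 2 + host_se ** 2)


def infer_horizontal_transfer(te: tuple[float, float],
                              hosts: dict[str, tuple[float, float]],
                              z_crit: float = 1.645) -> bool:
    """True only if TE divergence is significantly lower than that of every host gene."""
    return all(z_lower(te[0], te[1], mean, se) > z_crit
               for mean, se in hosts.values())


# (mean, standard error) between the saltans and willistoni groups.
p_ds = (0.117, 0.0204)
host_ds = {"Adh": (0.45, 0.051), "Sod": (0.69, 0.121), "per": (1.03, 0.246)}
p_dn = (0.048, 0.0055)
host_dn = {"Adh": (0.042, 0.0053), "Sod": (0.042, 0.0115), "per": (0.059, 0.0206)}

print(infer_horizontal_transfer(p_ds, host_ds))  # synonymous sites
print(infer_horizontal_transfer(p_dn, host_dn))  # nonsynonymous sites
```

Requiring the condition to hold for all host genes keeps the inference conservative: a single host gene with divergence comparable to that of the TE is enough to withhold the inference.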

Phylogenetic Analyses
P-Element Phylogeny
Phylogenetic analyses were done using parsimony and maximum likelihood (ML). Previous analyses had suggested the presence of many most-parsimonious trees (MPTs) for the canonical P-element sequences (Clark et al. 1995). In order to obtain an exhaustive representation of those, we performed a parsimony search with 5,000 random-addition heuristic search replicates, keeping no more than 50 trees per replicate. The log-likelihood score of all MPTs was determined using the HKY85 model of substitution (Hasegawa, Kishino, and Yano 1985) with rate heterogeneity between sites and parameters estimated from the data.

A full ML heuristic search is precluded by the large size of the P-element data set. Therefore, trees with the highest likelihood score were obtained by doing heuristic searches on all MPTs with the highest ML scores and on five additional trees among all MPTs, selected for the greatest differences. These five trees were obtained as follows: first, tree-to-tree distances (Waterman and Smith 1978) were used to select the two most distant trees among all MPTs. Then, the distances of all other MPTs to the first two were calculated. A third tree was then selected from among the ones for which the sum of distances to the first two was highest. The process was repeated to obtain the fourth and fifth trees. The ML heuristic search was performed with the option NOMULPARS in effect, using tree bisection-reconnection (TBR) branch swapping and the HKY85 model of substitution with rate heterogeneity between sites; the parameter values of the model were estimated for each starting tree.
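The selection of maximally different trees can be sketched as a greedy procedure on a precomputed matrix of tree-to-tree distances. The sketch assumes that "the process was repeated" means choosing, at each step, the tree with the largest summed distance to all trees already selected; the function name and the toy distance matrix are hypothetical.

```python
from itertools import combinations


def select_diverse(dist: list[list[float]], k: int) -> list[int]:
    """Greedily pick k tree indices that are far apart in tree-to-tree distance."""
    n = len(dist)
    if k < 2 or k > n:
        raise ValueError("k must be between 2 and the number of trees")
    # Start with the two most distant trees.
    first, second = max(combinations(range(n), 2),
                        key=lambda pair: dist[pair[0]][pair[1]])
    chosen = [first, second]
    # Add the tree whose summed distance to the chosen trees is highest.
    while len(chosen) < k:
        remaining = [i for i in range(n) if i not in chosen]
        best = max(remaining, key=lambda i: sum(dist[i][j] for j in chosen))
        chosen.append(best)
    return chosen


# Hypothetical symmetric distance matrix for five trees.
toy = [
    [0, 2, 9, 4, 5],
    [2, 0, 6, 3, 4],
    [9, 6, 0, 5, 7],
    [4, 3, 5, 0, 1],
    [5, 4, 7, 1, 0],
]
print(select_diverse(toy, 3))
```

With the toy matrix, trees 0 and 2 are the most distant pair, and tree 4 has the largest summed distance to both.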

Adh Phylogeny
A parsimony analysis was performed with a heuristic search of 20 random-sequence-addition replicates and TBR branch swapping. An ML analysis was performed with 20 random-addition heuristic search replicates using TBR branch swapping; the HKY85 model of substitution was used with rate heterogeneity between sites and parameters estimated from the data. During the first replicate, a topological constraint was enforced which specified monophyly of both the saltans and willistoni species groups. During the following nine replicates, the parameters used for the model of evolution were the same as those estimated in the first replicate. These parameters were re-estimated on the first tree found in replicate 10, and replicates 11–20 were performed using the re-estimated values for the model parameters. The same topological constraints enforced in replicate 1 were used in replicate 2, but not in any of the following 18 replicates.

Bootstrap analyses were performed for both P and Adh data sets using parsimony. Both analyses consisted of 100 bootstrap replicates, each with 10 random-addition replicates. In the P-element analysis, a maximum of 1,000 trees was saved during each replicate. All phylogenetic analyses were performed in PAUP*, version 4.0d64 (Swofford 1998).
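The resampling step that underlies each bootstrap replicate can be sketched as below: alignment columns are drawn with replacement to build a pseudoreplicate of the same length. This is only an illustration of the general bootstrap idea, not of PAUP* internals; the tree search on each pseudoreplicate is not shown, and the function name and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping rows aligned."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # The same sampled column indices are applied to every sequence.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


# Hypothetical three-sequence alignment.
toy_alignment = {"seqA": "ACGTAC", "seqB": "ACGTTC", "seqC": "ATGTAC"}
pseudo = bootstrap_alignment(toy_alignment, random.Random(1))
print(pseudo)
```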


    Results and Discussion
 
The dN/dS Ratio for Host Genes and P Canonical Elements
The strength of selective constraints on P-element sequences was investigated by comparing the estimated P-element sequence divergence among species groups with that obtained for three Drosophila genes, Adh, per, and Sod.

Estimates for dN were similar for Adh and Sod, with dN = 0.042 ± 0.0053 and 0.042 ± 0.0115, respectively, and slightly higher for per, with dN = 0.059 ± 0.0206 (fig. 1). The difference in the standard errors for these estimates reflects the fact that the Sod comparison was based on only two sequences and that in the per comparison, only one very short sequence (198 nt) from the saltans species group was used. Values of dS were approximately one order of magnitude higher than those for dN and were more variable among genes: dS = 0.45 ± 0.051 for Adh, 0.69 ± 0.121 for Sod, and 1.03 ± 0.246 for per (fig. 1). These values correspond to dN : dS ratios of 0.09 for Adh and 0.06 for both Sod and per. Differences in the magnitude of dS among genes have been found to be related to the intensity of natural selection on synonymous codon usage (Shields et al. 1988; Moriyama and Gojobori 1992). In order to test whether this was the case for the genes under study, we determined whether the magnitude of dS between the saltans and the willistoni species groups for the three host genes was correlated with the degree of codon bias (table 2). Adh and Sod sequences were similar to each other in degree of codon bias (Nc = 44 and 45, respectively, and CBI = 0.54 for both) and were more biased than per sequences (Nc = 57 and CBI = 0.29). These indices are correlated (fig. 2), so only the results obtained with CBI are presented. Figure 3 shows the relationship between the average CBI in each of the three host genes and the respective dS values between species groups. The relationship observed between codon bias and number of substitutions suggests that selection on codon composition is indeed a major factor in determining the rate of substitution at synonymous sites in these three genes.
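The dN : dS ratios quoted above follow directly from the point estimates, as the short calculation below shows. The dictionary layout and variable names are illustrative; the numbers are those given in the text.

```python
# (dN, dS) between the saltans and willistoni species groups.
host_estimates = {
    "Adh": (0.042, 0.45),
    "Sod": (0.042, 0.69),
    "per": (0.059, 1.03),
}

# Ratio of nonsynonymous to synonymous divergence, rounded to two decimals.
ratios = {gene: round(dn / ds, 2) for gene, (dn, ds) in host_estimates.items()}

for gene, ratio in ratios.items():
    print(f"{gene}: dN/dS = {ratio:.2f}")
```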



 
Fig. 1.—Mean numbers of substitutions per synonymous site (dS) and substitutions per nonsynonymous site (dN) between the saltans and willistoni species groups in canonical P elements and three host genes, Adh, Sod, and per

 

 
Table 2 Average Codon Bias and G+C Content at Synonymous Sites in the Canonical P Sequences and the Three Host Genes Studied

 


 
Fig. 2.—Correlation between two indices of codon bias, Nc and CBI, in sequences from three host loci, Adh (crosses; short-dash line; correlation coefficient r = -0.56), per (solid circles; long-dash line; r = -0.79), and Sod (solid triangles; solid line)

 


 
Fig. 3.—Relation between codon bias index (CBI) and number of substitutions per synonymous site (dS) (r2 = 0.84; NS)

 
The numbers of substitutions per synonymous site and substitutions per nonsynonymous site were estimated for all pairwise comparisons among the 48 canonical P elements. dN was smaller than dS in 1,102 of the 1,128 (= 48 × 47/2) pairwise comparisons between elements, with a median value of 0.37 for the ratio dN/dS. When elements obtained from the saltans species group were compared with those from the willistoni group, the average values of dN and dS and their standard errors were 0.048 ± 0.0055 and 0.117 ± 0.0204, respectively (fig. 1), for a dN/dS ratio of ~0.4. These values are significantly different (t = 3.25, P < 0.01). The dN/dS ratio was also estimated using ML in a phylogenetic context. A set of 27 most likely P-element trees was selected for this analysis (see Phylogenetic Analyses). In the likelihood analyses in which dN/dS was estimated from the data, this ratio varied between 0.72 and 0.73 in all 27 trees. However, these values do not provide a significantly better fit to the data than a ratio of 1 (0.05 < P < 0.1). The differences between the two estimates of dN/dS will be addressed in the discussion. Regardless, these results show that nonsynonymous sites in canonical P elements evolve on average at a rate between roughly two fifths and three quarters that of synonymous sites.
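The number of pairwise comparisons and the approximate size of the test statistic can be checked with a short calculation. The sketch assumes that the statistic is the difference between the two means divided by the square root of the sum of their squared standard errors; the exact formula used in the study is not stated, so a small difference from the reported t = 3.25 is expected. The function name is hypothetical.

```python
import math

# Number of distinct pairs among 48 elements.
n_pairs = math.comb(48, 2)
fraction_dn_below_ds = 1102 / n_pairs


def t_stat(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Difference of two means scaled by the combined standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# dS versus dN between the saltans and willistoni groups (values from the text).
t = t_stat(0.117, 0.0204, 0.048, 0.0055)

print(n_pairs, round(fraction_dn_below_ds, 3), round(t, 2))
```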

The Intensity of Selection in P Elements
The dN : dS ratio for P elements is high when compared with that of host genes (fig. 1). This may be due to either weak selection at nonsynonymous sites or very strong selection at synonymous sites. The low to moderate level of codon bias in the sequences of canonical P elements, indicated by an Nc value of 52 and a CBI value of 0.47 (table 2), suggests the former. Furthermore, the nature of codon bias in canonical P elements indicates that the bias present is due to mutation bias rather than selection-induced bias, for the reasons that follow. Mutation bias in Drosophila leads to a preferential accumulation of A and T nucleotides (~63%) over C and G (Shields et al. 1988), leading to an increase in A- and T-ending codons. Conversely, selection-induced codon bias in the Sophophora subgenus produces an excess of G- and especially C-ending codons (Shields et al. 1988; Starmer and Sullivan 1989; Akashi 1994). The proportion of G's and C's at synonymous positions in canonical P elements is approximately 40% (table 2). This is very similar to the frequency expected due to genome mutation bias. It is, however, possible that the discrepancy in synonymous-base composition between P elements and host genes could be due to a pattern of mutation or selection characteristic of P elements (Lerat, Biémont, and Capy 2000). The presence of termination codons in three of the 48 canonical elements and the presence of deletions that unambiguously disrupt the reading frame in eight of those elements further support a scenario of weak selection acting on P elements.

Comparison of dN and dS values between canonical P elements and host genes (fig. 1) estimated between species groups reveals that the average dN for P elements is similar to that observed for host genes. Also, the average value of dS observed for P elements is roughly 4 to 9 times smaller than the dS observed for host genes. Given our above conclusion of reduced selective constraints acting on P, the smaller value of dS for P elements than for host genes is unexpected. As we argued above, the small number of synonymous substitutions in P is not due to selective pressure on codon usage. Selection-induced codon bias is clearly higher in both Adh and Sod than in the P element. Another possible explanation for the small number of substitutions in P elements is that dS between species groups is larger in host genes because of a radical shift in codon preference between species groups in these genes, but not in P elements. Shifts in codon bias between the subgenera Drosophila and Sophophora, as well as between D. melanogaster and D. willistoni (subgenus Sophophora), have been documented (Starmer and Sullivan 1989; Anderson, Carew, and Powell 1993). If associated with strong selection on codon usage in host genes, different codon preferences in the two species groups might be radical enough to cause a large increase in dS. Under this hypothesis, host genes in which codon usage is under stronger selective pressure should have experienced the strongest shift, and a positive correlation between degree of codon bias and number of substitutions (as measured by dS) would be expected. Figure 3 shows that this is not the case. dS is negatively correlated with CBI, therefore providing evidence to reject this second explanation.

Selection Versus Recent Horizontal Transfer
An alternative explanation for the low level of divergence among canonical P sequences in the two species groups is that these elements shared a common ancestor more recently than did their host species. In other words, P elements have been transferred horizontally between species groups more recently than these species groups have shared a common ancestor. In this section, we formulate the predictions that allow us to distinguish between selection and horizontal transfer as possible explanations for the high similarity among all canonical P elements.

Selection (of an ad hoc nature) and recent horizontal transfer have very different consequences both for the level of congruence expected between the P-element and host phylogenies and for the proportionality of P-element divergence relative to host gene divergence, when sister groups of different ages are compared. Namely, if the low P-element divergence resulted from selection and the elements were transmitted vertically, TE and host phylogenies are expected to be congruent. In addition, P-element divergence and host gene divergence should be correlated. If, on the other hand, recent horizontal transfer caused the low P-element divergence, then the P-element phylogeny is not necessarily expected to be congruent with that of the host species. Also, P-element and host gene divergence should not be correlated. When the taxa compared have diverged recently, P-element divergence should be high relative to the divergence of host genes evolving under strong selective constraints. As we compare taxa that are progressively older, P-element divergence should become smaller in relation to the divergence of host genes. These predictions are investigated in the following two sections.

P-Element Versus Host Gene Phylogenies
Host Phylogeny
The phylogenetic relationships within the saltans and willistoni species groups have been studied before (Bicudo 1973; Throckmorton 1975; Gleason, Griffith, and Powell 1998; O'Grady 1998; O'Grady, Clark, and Kidwell 1998). We analyzed the phylogeny of the Adh sequences to assess whether the relationships among them reflect those previously determined for the species and to obtain branch length information with which to compare similar information obtained for the P-element sequences. Our parsimony analysis yielded seven MPTs, which were found in all 20 heuristic search replicates. All of these trees had higher likelihood scores than those resulting from the original ML search and were therefore used as the initial trees in another ML search. Three trees resulted from this search, which correspond to the three possible resolutions of the polytomy that includes D. austrosaltans, D. lusaltans, and D. saltans (in all three cases, the internal branch has length 0); the three trees were otherwise identical (fig. 4a). These results agree with previous hypotheses concerning the relationships in the saltans group (Bicudo 1973; Throckmorton 1975; O'Grady, Clark, and Kidwell 1998), with one exception. The position of the elliptica and neocordata subgroups is unusual in that they are believed to be the most anciently derived subgroups in the saltans group (Throckmorton 1975). The relationships within the willistoni group are also in agreement with previous analyses of the group (Gleason, Griffith, and Powell 1998), with two exceptions. The willistoni species group is paraphyletic with respect to the saltans group due to the position of D. nebulosa, and the willistoni subgroup is paraphyletic with respect to D. fumipennis. These discrepancies do not affect our results concerning P-element evolution.



Fig. 4.—Comparison of species and P-element phylogenetic histories. Diagonal lines unite P-element clades with the species from which they were sampled. Thick vertical lines along the species tree indicate taxa whose P elements are not monophyletic. Thick vertical lines along the P-element tree indicate clades that are formed by elements of more than one species. a, Maximum-likelihood tree of the host species based on the Adh data set. Values on the branches represent parsimony bootstrap support. b, One of 24 maximum-likelihood trees for the P-element data set

 
P-Element Phylogeny
The parsimony analysis of the P-element data set yielded 7,979 MPTs, the consensus of which is presented in figure 5a. The log-likelihood scores of the MPTs varied between -2,931.968 and -2,913.205. Among these, 22 had the maximum score (-2,913.205). Five other trees, with log-likelihood scores between -2,931.846 and -2,915.590, were chosen to represent the variation across MPTs (see Materials and Methods). ML searches on these 27 trees yielded 24 trees with a log-likelihood score of -2,912.420 and 3 trees with a log-likelihood score of -2,912.708. The majority-rule consensus of these 27 trees is presented in figure 5a. Most nodes that were well represented in the likelihood analysis were also present in the parsimony analysis and correspond exclusively to terminal or next-to-terminal nodes (fig. 5a).



Fig. 5.—Phylogeny of the canonical P elements. a, Result of the parsimony and maximum-likelihood analyses. Values above the branches represent the 50% majority-rule consensus of 27 trees from the maximum-likelihood analysis. The log-likelihood scores of these trees varied between -2,912.7077 and -2,912.4197. None of these scores differs significantly using a Kishino-Hasegawa test. Indicated below the branches are the 50% majority-rule consensus of 7,979 most-parsimonious trees, which were 419 steps long. b, Results of the bootstrap analysis using parsimony. The tree represents the 50% majority-rule consensus of 168,931 parsimony bootstrap trees

 
Both ML and parsimony analyses, as well as the parsimony bootstrap proportions (fig. 5b), indicate strong support for 13 small clades of canonical elements which correspond almost exactly to groups of sequences sampled from a single species. In the willistoni species group, the elements from D. tropicalis, D. equinoxialis, and D. nebulosa are monophyletic. In the saltans subgroup, the elements from D. sturtevanti, D. subsaltans, D. lusaltans, D. prosaltans, and D. austrosaltans are also monophyletic. In addition, there are a few instances in which the relationships among the elements mirror those of the host species. The willistoni subgroup elements form a monophyletic group, and so do those from the sister taxa D. capricorni and D. sucinea. The elements from D. paulistorum are paraphyletic in relation to those from D. pavlovskiana. Also, the elements from D. saltans (saltans subgroup) are closer to those from D. subsaltans (parasaltans subgroup) than to the ones from D. sturtevanti (sturtevanti subgroup). Elements from D. austrosaltans and D. fumipennis form the only well-supported clade that has elements from both species groups.

One tree, selected arbitrarily from among the 24 with the highest log-likelihood score, is presented in figure 4b. In the canonical clade, most substitutions occur within the 13 clades identified in the bootstrap analysis. In the deeper part of the tree, branch lengths are small and the structure of the tree is largely unresolved.

These phylogenetic analyses yield two main conclusions. First, P-element and species phylogenies are not congruent (fig. 4). Second, the fact that in the P-element phylogeny so many of the deeper branches (above species level) have length 0 suggests that the transmission of P elements among species occurred rapidly, preventing the accumulation of substitutions in the early stages of the diversification of this P-element subfamily. Both conclusions are fully compatible with the scenario of recent horizontal transfer of P elements into the two Sophophoran species groups but are improbable under a scenario of vertical transmission according to which P elements were present in the common ancestor of all species and evolved under strong selective constraints.

P-Element Versus Host Gene Divergence
P-element and host gene divergences between pairs of taxa were estimated to determine if P elements are at least as divergent as host genes and if those divergences are correlated.

Drosophila willistoni Species Group
Figure 6a shows the results of a comparison of the willistoni subgroup P elements with all others from the willistoni group. dN for P (0.05) lies between the values for host genes (dN = 0.08 for per and 0.04 for Adh). dS, however, is much smaller for P (0.13) than for per (0.62) or Adh (0.54). Similar comparisons were performed across progressively younger nodes of the species tree (see fig. 4a): D. tropicalis versus its sister clade (fig. 6b), D. willistoni versus its sister clade (fig. 6c), D. equinoxialis versus its sister clade (fig. 6d), and, finally, between D. paulistorum and D. pavlovskiana (fig. 6e). As expected, host gene divergence decreases with the age of the node. P-element divergence, on the other hand, not only remains approximately constant in the first three comparisons, but also is always smaller than that of host genes (fig. 6b–d). This relationship changes when D. paulistorum and D. pavlovskiana are compared. The divergence between the two pairs of closely related elements (Dpauli13-Dpavlo16 and Dpauli10-Dpavlo21) is now intermediate between that of the two host genes (fig. 6e). These results are contrary to the expectations under vertical transmission and constant selective pressure operating on these sequences.
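
The logic of this comparison can be illustrated with a minimal sketch that uses only the mean dS values quoted above for the willistoni subgroup versus the rest of the willistoni group. The function names and the simple decision rule (P-element dS below that of every host gene) are illustrative assumptions, not the statistical procedure used in the study.

```python
# Mean dS values quoted in the text for the comparison shown in fig. 6a.
P_DS = 0.13
HOST_DS = {"per": 0.62, "Adh": 0.54}


def p_less_divergent_than_hosts(p_ds, host_ds):
    """Return True when P-element dS is below the dS of every host gene.

    Hypothetical helper: a crude stand-in for the qualitative comparison
    made in the text, with no significance testing.
    """
    return all(p_ds < value for value in host_ds.values())


def fold_deficit(p_ds, host_ds):
    """Ratio of each host-gene dS to the P-element dS."""
    return {gene: value / p_ds for gene, value in host_ds.items()}


if __name__ == "__main__":
    print(p_less_divergent_than_hosts(P_DS, HOST_DS))
    for gene, fold in fold_deficit(P_DS, HOST_DS).items():
        print(f"{gene}: host dS is {fold:.1f} times the P-element dS")
```

Under these values the host genes are between four and five times as divergent at synonymous sites as the P elements, which is the pattern the text treats as inconsistent with vertical transmission.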



Fig. 6.—Mean numbers of substitutions per synonymous site (dS) and substitutions per nonsynonymous site (dN) for canonical P elements, Adh and per, in six pairwise comparisons of sister clades within the willistoni species group

 
The most likely explanation of these results is that the introduction of canonical P elements into the willistoni subgroup took place before the divergence of D. paulistorum and D. pavlovskiana, but after the divergence of these two taxa from D. equinoxialis. This implies at least four horizontal transfers within the willistoni subgroup alone: one into D. tropicalis, one into D. willistoni, one into D. equinoxialis, and one into the lineage leading to D. paulistorum and D. pavlovskiana. Vertical transmission from a common ancestor is sufficient to explain the presence of canonical elements in the two latter species.

The same rationale was used to interpret the comparison between D. sucinea and D. capricorni. The most closely related elements in the two species were compared (Dcapri15 and Dsuci1). The divergence of the P elements is intermediate between that of each of the two host genes (fig. 6f). Therefore, we conclude that P elements could have been transmitted vertically to these two species from their common ancestor. The lineages leading to D. nebulosa and D. fumipennis were present before the introduction of canonical elements into the willistoni group, which, according to our results, took place at the time of the divergence of D. pavlovskiana from D. paulistorum and that of D. capricorni from D. sucinea (fig. 4). In summary, the distribution of P elements within the willistoni species group can be explained by seven horizontal transfer events: one into D. nebulosa, one into D. fumipennis, one into the ancestor of D. sucinea and D. capricorni, and four within the willistoni subgroup.

These results rule out the possibility that the monophyly of the P elements from the willistoni subgroup was due to the presence of canonical elements in the ancestor of the subgroup. The observation that some of the crosses between species within the willistoni subgroup produce fertile offspring (reviewed by Bock 1984) indicates that gene flow between these species could have been present until fairly recently. This suggests introgression as a possible horizontal transfer mechanism for the spread of P elements among the species in the willistoni subgroup, and as the reason why these elements form a monophyletic group.

Drosophila saltans Species Group
Similar analyses were performed on P elements in the saltans group. First, we compared the elements from D. sturtevanti with those from D. saltans and D. subsaltans (fig. 7a). Then, the two elements from D. subsaltans and Dsalt28 from D. saltans were compared (fig. 7b). As P elements were less divergent than the host gene in both comparisons, we conclude that P elements have been transmitted horizontally between the three species subgroups in question. We further compared the four species in the saltans subgroup, D. saltans, D. prosaltans, D. austrosaltans, and D. lusaltans (fig. 7c and d). In both pairwise comparisons performed, P-element divergence was higher than that observed for Adh, suggesting that canonical elements might have been present in the saltans subgroup prior to its divergence. In conclusion, at least three instances of horizontal transfer are detectable in the saltans group: one into the sturtevanti subgroup, one into the parasaltans subgroup, and one into the ancestor of the saltans subgroup. Finally, the close similarity of D. austrosaltans elements to two others from D. fumipennis is clearly the result of horizontal transfer between those species (fig. 8).



Fig. 7.—Mean numbers of substitutions per synonymous site (dS) and substitutions per nonsynonymous site (dN) for canonical P elements and Adh in four pairwise comparisons of sister clades within the saltans species group

 


Fig. 8.—Mean numbers of substitutions per synonymous site (dS) and substitutions per nonsynonymous site (dN) for canonical P elements and Adh between Drosophila fumipennis (willistoni group) and Drosophila austrosaltans (saltans group)

 
The results of this section strongly suggest that the invasion and subsequent spread of the canonical P subfamily into the saltans and willistoni species groups by horizontal transfer occurred very recently, as most extant species subgroups were already present.
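
The minimum number of horizontal transfer events inferred in this section can be tallied in a short sketch. The dictionary keys and event labels below are our own paraphrase of the events listed in the text, and the grouping is an assumption made for illustration only.

```python
# Horizontal transfer events as listed in this section.
# Labels are paraphrases; the grouping is illustrative.
HT_EVENTS = {
    "willistoni group": [
        "into D. nebulosa",
        "into D. fumipennis",
        "into the ancestor of D. sucinea and D. capricorni",
        "into D. tropicalis",
        "into D. willistoni",
        "into D. equinoxialis",
        "into the lineage leading to D. paulistorum and D. pavlovskiana",
    ],
    "saltans group": [
        "into the sturtevanti subgroup",
        "into the parasaltans subgroup",
        "into the ancestor of the saltans subgroup",
    ],
    "between groups": [
        "between D. austrosaltans and D. fumipennis",
    ],
}

# Sum the events over all three categories.
total_events = sum(len(events) for events in HT_EVENTS.values())

if __name__ == "__main__":
    for category, events in HT_EVENTS.items():
        print(f"{category}: {len(events)}")
    print(f"minimum number of events: {total_events}")
```

Counted this way, seven events in the willistoni group, three in the saltans group, and one between the two groups give the minimum of 11 events reported for the 18 species surveyed.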

Age of the Most Recent Common Ancestor of Canonical P Elements
We determined the age of the most recent common ancestor of all canonical P elements (MRCA-P) to assess if the age of the subfamily was consistent with the conclusion of the previous section. The age of the MRCA-P was determined in three ways: (1) by calibrating dS between P elements with the rate of synonymous substitution estimated for Drosophila genes with low codon bias, (2) by calibrating dS between P elements with the rate of synonymous substitution in the R1 and R2 TE families, and (3) by examining the relationship between codon bias and dS. The results of these three methods are as follows:

  1. According to the ML analyses, the P elements sampled from D. capricorni and D. sucinea form a clade that is the sister group to the other canonical elements (fig. 6a). The estimated average number of synonymous substitutions between these two groups of elements was dS = 0.100 ± 0.022. Sharp and Li (1989) estimated the synonymous substitution rate in Drosophila genes with low codon bias to be 1.6% per million years. The age of the MRCA-P could then be estimated to be approximately 3 Myr.
  2. In the R1 and R2 TE families, synonymous sites evolve roughly four times as fast as synonymous sites in host Drosophila Adh (Eickbush et al. 1995). We estimated dS in Adh between the saltans and the willistoni species groups to be 0.45. If P-element synonymous sites evolved four times as fast, a dS value of 1.8 should be observed. As the dS value observed is 18 times as small as 1.8 (dS = 0.100), the MRCA-P must be approximately 18 times as young as the MRCA of the host species. The divergence between the saltans and the willistoni groups occurred between 15 and 30 MYA (Clark and Kidwell 1997). The MRCA-P was then estimated to be roughly 1–2 Myr old.
  3. Based on the data available for the three host genes, the relationship between average codon bias, as measured by CBI, and the dS between the saltans and the willistoni species groups can be described by the expression y = 1.6 - 1.9x (fig. 3). Assuming that the processes which determine the evolution of the P-element sequences are the same as those affecting the host genes, the average dS of P elements between species groups, given a CBI of 0.47, should be approximately 0.70. As the value of dS observed is seven times as small as that, the MRCA-P should be one seventh the age of the MRCA of the host genome. This provides an estimated age between 2 and 4 Myr. It should be noted that this age is probably overestimated, because the relationship between CBI and dS assumes that the codon bias present is the result of natural selection pressure. As discussed above, the codon bias present in the P sequences seems to be caused by mutation, not selection. Therefore, selection-induced codon bias in P is at most very small, corresponding to an expected value of dS larger than 1. As the observed value of dS is 10 times as small, the MRCA-P must be 10 times as young as the MRCA of the host species and therefore dates to as recently as 1–3 MYA. This last estimate should be taken with caution. The slope of the regression is based on only three points and is not significantly different from 0, given the large standard deviations associated with the values of dS. Nevertheless, this estimate is informative, in that it is in perfect agreement with the other two results.
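
The arithmetic behind the three estimates above can be reproduced with the following sketch. All numerical inputs are taken from the text. The function names are hypothetical, and the factor of 2 in the first method is our assumption that the 1.6% per Myr rate applies per lineage, so that two diverging lineages accumulate differences at twice that rate; this assumption reproduces the stated estimate of approximately 3 Myr.

```python
DS_P = 0.100                    # observed dS between the two groups of elements
RATE_LOW_BIAS = 0.016           # synonymous substitutions per site per Myr
HOST_SPLIT_MYR = (15.0, 30.0)   # saltans/willistoni divergence, in Myr


def age_from_rate(ds, rate_per_lineage):
    """Method 1: age = dS / (2 * rate), assuming a per-lineage rate."""
    return ds / (2.0 * rate_per_lineage)


def age_from_expected_ds(ds_observed, ds_expected, host_age_range):
    """Methods 2 and 3: scale the host divergence time by observed/expected dS."""
    fold = ds_expected / ds_observed
    return tuple(age / fold for age in host_age_range)


# Method 1: calibration with low-codon-bias Drosophila genes.
method1 = age_from_rate(DS_P, RATE_LOW_BIAS)

# Method 2: R1/R2 synonymous sites evolve about 4x as fast as Adh (dS = 0.45).
expected_r1r2 = 4.0 * 0.45
method2 = age_from_expected_ds(DS_P, expected_r1r2, HOST_SPLIT_MYR)

# Method 3: regression y = 1.6 - 1.9x evaluated at CBI = 0.47.
expected_cbi = 1.6 - 1.9 * 0.47
method3 = age_from_expected_ds(DS_P, expected_cbi, HOST_SPLIT_MYR)

if __name__ == "__main__":
    print(f"method 1: {method1:.2f} Myr")
    print(f"method 2: {method2[0]:.2f} to {method2[1]:.2f} Myr")
    print(f"method 3: {method3[0]:.2f} to {method3[1]:.2f} Myr")
```

The sketch gives roughly 3.1 Myr for the first method, 0.8 to 1.7 Myr for the second, and 2.1 to 4.2 Myr for the third, in line with the rounded values given in the list.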

These three estimates, even though not completely independent (they all depend in part on the number of substitutions in the Adh gene), are very similar to one another. All point to an age for the MRCA-P of approximately 2–3 Myr, an estimate in very good agreement with the conclusion of the previous section. The willistoni subgroup originated between 8 and 12 MYA (Clark and Kidwell 1997). This date is consistent with the spread of canonical elements through this subgroup by horizontal transfer within the last 1–3 Myr, as most extant species in the subgroup were likely present by the time the MRCA-P invaded these taxa. The saltans subgroup is very young (O'Grady, Clark, and Kidwell 1998), and so it is conceivable that its diversification could have occurred subsequent to the invasion by the MRCA-P.

Divergence Between P-Element Subfamilies
Autonomous elements, defined as those capable of catalyzing transposition, are known in two P-element subfamilies: the canonical subfamily and that containing the Scaptomyza elements (table 1; O'Hare and Rubin 1983; Simonelig and Anxolabéhère 1991). We estimated dN and dS between these two subfamilies to determine the strength of the selective constraints in intersubfamily divergence. The values obtained were dN = 0.195 and dS = 1.25 (fig. 9), for a dN/dS ratio of 0.16. This ratio is probably overestimated, given that synonymous sites are saturated. This ratio is much smaller than that observed within both the canonical subfamily (0.75 > dN/dS > 0.42) and the subfamily of Scaptomyza elements (dN/dS = 0.26). dN and dS were also estimated between the canonical subfamily and other more divergent subfamilies of P elements present in the saltans and willistoni subgroups. Even though the divergence between these and the canonical subfamily was too large to allow accurate estimation of dS for all pairwise comparisons of elements between subfamilies, the comparisons that were possible suggest that dS is again about one order of magnitude higher than dN (results not shown). These results indicate that the divergence between P-element subfamilies is characterized by stronger constraints in nonsynonymous sites than those observed in the divergence between closely related elements.



Fig. 9.—Mean numbers of substitutions per synonymous site (dS) and substitutions per nonsynonymous site (dN) for canonical P elements and Adh between the two species of Scaptomyza (Drosophilidae) and the species in the saltans and willistoni species groups (genus Drosophila, subgenus Sophophora)

 
Using a rate of synonymous substitution of 1.6% per Myr, the estimated divergence time between Scaptomyza elements and the canonical P element clade is at least 40 Myr (saturation of synonymous sites precludes a more accurate estimate). The existence of an even more divergent P-element sequence, found in the Calliphorid blow fly Lucilia cuprina (Perkins and Howells 1992), indicates that the P-element family is considerably older than 40 Myr.
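
As a check on the intersubfamily figures, the dN/dS ratio and the minimum divergence time can be recomputed from the values reported in this section. The variable names are hypothetical, and the factor of 2 again reflects our assumption that the 1.6% per Myr rate is a per-lineage rate.

```python
DN_BETWEEN = 0.195   # nonsynonymous divergence, canonical vs. Scaptomyza elements
DS_BETWEEN = 1.25    # synonymous divergence (saturated, so a lower bound)
RATE = 0.016         # synonymous substitutions per site per Myr

# Ratio of nonsynonymous to synonymous divergence between the two subfamilies.
dn_ds_ratio = DN_BETWEEN / DS_BETWEEN

# Lower bound on the divergence time, assuming a per-lineage rate.
min_divergence_myr = DS_BETWEEN / (2.0 * RATE)

if __name__ == "__main__":
    print(f"dN/dS between subfamilies: {dn_ds_ratio:.3f}")
    print(f"minimum divergence time: {min_divergence_myr:.1f} Myr")
```

The ratio rounds to 0.16 and the time to about 39 Myr, matching the reported ratio and the stated bound of roughly 40 Myr. Because dS is saturated, the true divergence time can only be older than this figure.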


    Conclusions
 
In the present study, we investigated the relevance of horizontal transfer and selection in the evolution of the canonical subfamily of P elements. Our results show that while the P-element family as a whole is older than 40 Myr, the diversification of the canonical P subfamily dates back only 2–3 Myr. Since then, canonical P elements have spread into 9 out of 10 species surveyed in the willistoni group and into 6 out of 8 species surveyed in the saltans group. Our results suggest that this spread occurred soon after introduction of the subfamily into these groups and that it was achieved mostly by means of horizontal transfer, with subsequent diversification within species. We estimate that a minimum of 11 horizontal transfer events is necessary to explain the distribution of the canonical subfamily within the saltans and willistoni species groups. This shows not only that horizontal transfer is a frequent occurrence in the P family, but also that a single subfamily is able to spread by this means through two species groups in only a few million years. The high mobility of P elements previously remained unnoticed because these horizontal transfer events took place between closely related taxa.

While horizontal transfer clearly plays a major part in the evolution of the canonical subfamily, the role of selection is not as easy to ascertain. Our results show that when canonical P elements are compared among species groups, the dN/dS ratio is approximately 0.45. However, when this ratio is estimated over the entire canonical P-element phylogeny, it increases to 0.75 (and is not significantly different from 1). This suggests that the evolution of P elements in younger branches of the P-element phylogeny, namely among and within species, is characterized by weaker selective constraints in nonsynonymous sites than those in older branches. This was confirmed by estimating the average dN/dS ratio within the 10 clades formed by elements from a single species. dN/dS in these intraspecific comparisons varied between 0.39 and 2.9, revealing a general lack of constraints in nonsynonymous sites. These results contrast sharply with those observed when subfamilies of P elements are compared; the dN/dS ratio among subfamilies is one order of magnitude smaller, revealing strong selective constraints in the evolution of P-element sequences when the elements compared are only distantly related.

Several questions are raised. First, is there a relationship between the high rate of horizontal transfer observed for the canonical subfamily and the "patchy" distribution of P elements in the saltans and willistoni species groups? Second, what is the origin of the canonical P elements, given that their presence in the New World Sophophora groups is so recent? Third, why do selective constraints, as reflected in the dN/dS ratio, vary through different levels of P-element divergence? And finally, how do the scenarios of frequent horizontal transfer and low selective constraints fit into the life cycle of the P family? These questions will be addressed in the following sections.

Horizontal Transfer and the Distribution of P Elements
The distribution of P elements in the subgenus Sophophora is not homogeneous, but "patchy"; while some species lack P elements altogether, others seem to host multiple P subfamilies (Clark and Kidwell 1997).

Within the willistoni species group, D. insularis, a species restricted to a few islands in the Lesser Antilles (Spassky et al. 1971), has no canonical P elements (Daniels and Strausbaugh 1986; Clark et al. 1995). Daniels et al. (1990) proposed that this could be due to the spread of canonical P elements by horizontal transfer after the geographical isolation of D. insularis was established. Alternatively, Clark et al. (1995) proposed that these elements could have been lost following the geographic isolation of this species. Our results provide support for the first hypothesis. Drosophila insularis is the oldest species in the willistoni subgroup (Gleason, Griffith, and Powell 1998), which is 8–12 Myr old (Clark and Kidwell 1997). Hence, it seems more plausible that this species was already formed when the canonical P element spread took place 2–3 MYA and that its geographical isolation shielded the species from invasion.

In the elliptica and cordata subgroups, represented here by D. emarginata and D. neocordata, all species surveyed are devoid of any discernible P elements (Daniels et al. 1990; Clark et al. 1995). These two subgroups, thought to be the most anciently derived within the saltans group, are believed to have diversified in North America (Throckmorton 1975). Meanwhile, and prior to the formation of the Isthmus of Panama, an ancestor of the sturtevanti, parasaltans, and saltans subgroups crossed to South America, where these subgroups later diversified (Throckmorton 1975). Daniels et al. (1990) suggested that P elements could have spread throughout the saltans and willistoni groups in South America, at a time when the elliptica and cordata subgroups were still geographically isolated in North America. Alternatively, Clark et al. (1995) proposed that P elements could have been lost from the elliptica and cordata subgroups early in the history of the saltans group, when those two subgroups were isolated from the lineages leading to the more recently derived saltans subgroups. However, we now know that the spread of canonical P elements within the group took place within the last 3 Myr, at a time when North and South America were no longer disconnected. The current distributions of the elliptica and cordata subgroups through Central and South America overlap those of the other saltans and willistoni subgroups (Patterson and Mainland 1944; Magalhães 1962; Spassky et al. 1971). Therefore, the absence of canonical P elements in the elliptica and cordata subgroups does not seem to be due to their geographical isolation. An alternative hypothesis is that there might be host factors in the elliptica and cordata lineages that confer resistance to P-element invasion. For example, recent simulation studies show that the inability of a species to cope with the type of mutations induced by transposition could prevent P-element invasion (Quesneville and Anxolabéhère 1997). This hypothesis, which can explain the complete absence of P-homologous sequences in these species, could potentially be tested with transformation studies.

Our analyses suggest that the patchy taxonomic distribution of canonical P elements is due not to the loss of vertically transmitted elements, but to their spread by horizontal transfer. A few other instances of horizontal transfer involving P elements have already been documented (Clark, Maddison, and Kidwell 1994; Hagemann, Haring, and Pinsker 1996; Clark and Kidwell 1997; Loreto et al. 1998), which suggests that this phenomenon is present in other taxa carrying P elements. Also, preliminary analyses of additional P subfamilies present in the saltans and willistoni groups indicate that horizontal transfer is common in other P subfamilies (unpublished data). Detailed analyses, similar to the ones performed here for the canonical elements, of all P subfamilies could be done to ascertain if the history of the canonical subfamily is representative of that of the whole P family.

Where Did Canonical P Elements Come From?
This question has been previously asked, in the context of the recent introduction of canonical P elements into D. melanogaster (Engels 1989). The same question resurfaces now that we know that the invasion of the willistoni and saltans groups by canonical P elements, even though earlier than that of D. melanogaster, was still fairly recent. A detailed phylogenetic analysis of P-element sequences in the subgenus Sophophora places the clade of canonical elements as the sister group to all other subfamilies of Drosophila and Scaptomyza P elements (Clark and Kidwell 1997). In addition to the canonical subfamily, that study identified three other P subfamilies in species of the saltans and willistoni groups. The four subfamilies form distantly related clades, as indicated by mutational saturation at third codon positions when elements in different subfamilies are compared (unpublished data). However, within P subfamilies, elements sampled from different species, and even species groups, are often very closely related (unpublished data). The discreteness of these subfamilies coexisting in the same genome provides a strong indication that the presence of multiple P subfamilies in the saltans and willistoni species groups is the result of multiple waves of horizontal transfer events. This further emphasizes the relevance of this mode of transmission in the evolution of the P TE family. The source of these waves of horizontal transfer is unknown. It is possible that lineages leading to each subfamily were present but dormant in a New World sophophoran species and at some point spread through the two species groups. Alternatively, a reservoir may exist in another Drosophila species or perhaps even in a different taxonomic group. This could potentially be investigated by PCR analysis using pools of DNA from multiple species. These species could be sampled among the species that have easy access to rotting fruits in which the eggs of these Drosophila species are laid, such as soil organisms, nematodes, or other fruit-feeding dipterans.

Selection in the P Family of TEs
Five mechanisms through which selection can be effective on TE sequences have recently been summarized (Witherspoon 1999): selection on TEs is expected if (1) transposition increases host fitness, (2) transcripts from functional TEs increase host fitness, (3) functional elements transpose at a higher frequency within a genome, (4) functional elements transpose at a higher frequency within a population, and/or (5) only functional elements can spread by horizontal transfer. According to the first four mechanisms, one would expect selection to be effective on TE intraspecific evolution. However, if selection operates only at the time of horizontal transfer, as described by the fifth mechanism, selective constraints should be detected only when elements from different species are compared, as horizontal transfer between taxa creates a sieve through which functional elements pass more easily than nonfunctional elements.

Our results show that the dN/dS ratio in intraspecific P-element comparisons does not differ significantly from 1 but that this ratio decreases for interspecific P comparisons (0.4 < dN/dS < 1), providing strong support for the fifth selection mechanism. The fact that it is even smaller (dN/dS < 0.2) in intersubfamily P comparisons is also easily explained: lineages leading to distantly related elements have survived for longer periods of evolutionary time, with the concurrent accrual of mutations at synonymous sites. These elements would likely have survived through further rounds of horizontal transfer, which would have imposed selective constraints in nonsynonymous positions. Consequently, the dN/dS ratio among elements of different subfamilies is expected to be lower than that observed for comparisons among the closely related elements that compose a subfamily.

Long-Term Maintenance, Horizontal Transfer, and the Life Cycle of P Elements
The life cycle of P elements is similar to that described for members of other class II TE families, such as the mariner-like elements (Engels 1989; Hartl et al. 1997). This cycle starts with the invasion of a naive host, a rapid increase in copy number, the spread through the host population, and the eventual stabilization of copy number once repression of transposition arises. The frequency of functional elements then decreases, as functional copies are lost through mutation, excision, drift, and selection. Extinction is the eventual fate of TE systems such as P, in which the transposition of full-length, functional elements can give rise to both functional and nonfunctional elements; nonfunctional elements can also transpose, and transposition is a negative function of copy number (Kaplan, Darden, and Langley 1985). Horizontal transfer of a functional element into a naive host, with its concomitant high rates of unrepressed transposition, provides the source for a large number of functional copies and a chance to escape extinction (Lohe et al. 1995; Robertson and Lampe 1995). Overall, our results, together with those of Witherspoon (1999), provide strong empirical evidence that horizontal transfer is the main, and maybe the unique, source of selective constraint on the P-element transposase sequence as a whole, at least in the hosts surveyed to date. Whether this selection mechanism is sufficient to have maintained the P-element family for over 40 Myr is unknown. It is possible that a reservoir species exists in which selection acts during intraspecific evolution of P elements. Simulation studies are needed to address this question from a theoretical standpoint. At the same time, surveys of the genomes of species that share the habitat of sophophoran drosophilids are needed to assay for the presence of canonical P elements.


    Supplementary Material
 
The Adh sequences for the Scaptomyza species have the following accession numbers: Scaptomyza pallida—AF264071; Scaptomyza elmoi—AF264072.


    Acknowledgements
 
We thank Jon Clark, Alan de Queiroz, John Gatesy, Michael Nachman, and Patrick O'Grady for discussions and comments on previous versions of this manuscript, John Hulsenbeck and David Maddison for methodological advice, and Junta Nacional de Investigação Científica e Tecnológica (JNICT) and the Research Training Grant for the Analysis of Biological Diversification at the University of Arizona for fellowship support to J.C.S. This work was supported by National Science Foundation (NSF) grants DEB-9701252 and DEB-9815754.


    Footnotes
 
Pierre Capy, Reviewing Editor

1 Abbreviations: ML, maximum likelihood; MPT, most-parsimonious tree; MRCA-P, most recent common ancestor of canonical P elements; ORF, open reading frame; TBR, tree bisection-reconnection; TE, transposable element. Back

2 Keywords: P element; horizontal transfer; selection on transposable elements; Drosophila; Sophophora

3 Address for correspondence and reprints: Joana Silva, Genetics Program, BioSciences West #310, University of Arizona, Tucson, Arizona 85721. E-mail: joana@u.arizona.edu


    literature cited
 

    Akashi, H. 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927–935

    Anderson, C., E. A. Carew, and J. R. Powell. 1993. Evolution of the Adh locus in the Drosophila willistoni group: the loss of an intron, and shift in codon usage. Mol. Biol. Evol. 10:605–618

    Berg, D. E., and M. M. Howe. 1989. Mobile DNA. Pp. xi–xii in D. E. Berg and M. M. Howe, eds. American Society for Microbiology, Washington, D.C

    Bicudo, H. E. M. C. 1973. Chromosomal polymorphism on the saltans group of Drosophila. I. The saltans subgroup. Genetica 44:520–552

    Bock, I. R., ed. 1984. Interspecific hybridization in the genus Drosophila. Pp. 41–70 in M. K. Hecht, B. Wallace, and G. T. Prance, eds. Evolutionary biology. Vol. 18. Plenum Press, New York

    Capy, P., C. Bazin, D. Higuet, and T. Langin. 1997. Dynamics and evolution of transposable elements. Landes Bioscience, Austin, Tex

    Citri, Y., H. V. Colot, A. C. Jacquier, Q. Yu, J. C. Hall, D. Baltimore, and M. Rosbash. 1987. A family of unusually spliced biologically active transcripts encoded by a Drosophila clock gene. Nature 326:42–47

    Clark, J. B., T. K. Altheide, M. J. Schlosser, and M. G. Kidwell. 1995. Molecular evolution of P transposable elements in the genus Drosophila I. The saltans and willistoni species groups. Mol. Biol. Evol. 12:902–913

    Clark, J. B., and M. G. Kidwell. 1997. A phylogenetic perspective in P transposable element evolution in Drosophila. Proc. Natl. Acad. Sci. USA 94:11428–11433

    Clark, J. B., W. P. Maddison, and M. G. Kidwell. 1994. Phylogenetic analysis supports horizontal transfer of P transposable elements. Mol. Biol. Evol. 11:40–50

    Daniels, S. B., K. R. Peterson, L. D. Strausbaugh, and M. G. Kidwell. 1990. Evidence for horizontal transmission of the P transposable element between Drosophila species. Genetics 124:339–355

    Daniels, S. B., and L. D. Strausbaugh. 1986. The distribution of P-element sequences in Drosophila: the willistoni and saltans species groups. J. Mol. Evol. 23:138–148

    Eickbush, D. G., W. C. Lathe III, M. P. Francino, and T. H. Eickbush. 1995. R1 and R2 retrotransposable elements of Drosophila evolve at rates similar to those of nuclear genes. Genetics 139:685–695

    Engels, W. R. 1989. P elements in D. melanogaster. Pp. 437–483 in D. E. Berg and M. M. Howe, eds. Mobile DNA. American Society for Microbiology, Washington, D.C

    Finnegan, D. J. 1992. Transposable elements. Curr. Opin. Genet. Dev. 2:861–867

    Flavell, A. J. 1999. Long terminal repeat retrotransposons jump between species. Proc. Natl. Acad. Sci. USA 96:12211–12212

    Gleason, J. M., E. C. Griffith, and J. R. Powell. 1998. A molecular phylogeny of the Drosophila willistoni group: conflicts between species concepts? Evolution 52:1093–1103

    Gleason, J. M., and J. R. Powell. 1997. Interspecific and intraspecific comparisons of the period locus in the Drosophila willistoni sibling species. Mol. Biol. Evol. 14:741–753

    Hagemann, S., E. Haring, and W. Pinsker. 1996. Repeated horizontal transfer of P transposons between Scaptomyza pallida and Drosophila bifasciata. Genetica 98:43–51

    Hagemann, S., W. J. Miller, and W. Pinsker. 1994. Two distinct P element subfamilies in the genome of D. bifasciata. Mol. Gen. Genet. 244:168–175

    Hartl, D. L., E. R. Lozovskaya, D. I. Nurminsky, and A. R. Lohe. 1997. What restricts the activity of mariner-like transposable elements? Trends Genet. 13:197–201

    Hasegawa, M., H. Kishino, and T. Yano. 1985. Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160–174

    Jordan, I. K., L. V. Matyunina, and J. F. McDonald. 1999. Evidence for the recent horizontal transfer of long terminal repeat retrotransposon. Proc. Natl. Acad. Sci. USA 96:12621–12625

    Kaplan, N., T. Darden, and C. Langley. 1985. Evolution and extinction of transposable elements in Mendelian populations. Genetics 109:459–480

    Konopka, R. J., and S. Benzer. 1971. Clock mutants of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 68:2112–2116

    Kwiatowski, J., A. Latorre, D. Skarecky, and F. J. Ayala. 1994a. Characterization of a Cu/Zn superoxide dismutase-encoding gene region in Drosophila willistoni. Gene 147:295–296

    Kwiatowski, J., D. Skarecky, K. Bailey, and F. J. Ayala. 1994b. Phylogeny of Drosophila and related genera inferred from nucleotide sequence of the Cu/Zn Sod gene. J. Mol. Evol. 38:443–454

    Kyriacou, C. P., and J. C. Hall. 1989. Spectral analysis of Drosophila courtship song rhythms. Anim. Behav. 37:850–859

    Lansman, R. A., R. O. Shade, T. A. Grigliatti, and H. W. Brock. 1987. Evolution of P transposable elements: sequences of Drosophila nebulosa P elements. Proc. Natl. Acad. Sci. USA 84:6491–6495

    Lerat, E., C. Biémont, and P. Capy. 2000. Codon usage and the origin of P elements. Mol. Biol. Evol. 17:467–468

    Lindsley, D. L., and G. G. Zimm. 1992. The genome of Drosophila melanogaster. Academic Press, San Diego, Calif

    Lohe, A. R., E. N. Moriyama, D.-A. Lidholm, and D. L. Hartl. 1995. Horizontal transmission, vertical inactivation, and stochastic loss of mariner-like transposable elements. Mol. Biol. Evol. 12:62–72

    Loreto, E. L. S., L. Basso da Silva, A. Zaha, and V. L. S. Valente. 1998. Distribution of transposable elements in the neotropical species of Drosophila. Genetica 101:153–165

    McAllister, B. F., and J. H. Werren. 1997. Phylogenetic analyses of a retrotransposon with implications for strong evolutionary constraints on reverse transcriptase. Mol. Biol. Evol. 14:69–80

    Magalhães, L. E. 1962. Notes on taxonomy, morphology and distribution of the saltans group of Drosophila, with descriptions of four new species. Univ. Tex. Publ. 6205:135–154

    Misra, S., and D. C. Rio. 1990. Cytotype control of Drosophila P element transposition: the 66kd protein is a repressor of transposase activity. Cell 62:269–284

    Montchamp-Moreau, C., S. Ronsseray, M. Jacques, M. Lehmann, and D. Anxolabéhère. 1993. Distribution and conservation of sequences homologous to the 1731 retrotransposon in Drosophila. Mol. Biol. Evol. 10:791–803

    Moriyama, E. N., and T. Gojobori. 1992. Rates of synonymous substitutions and base composition of nuclear genes in Drosophila. Genetics 130:855–864

    Morton, B. R. 1993. Chloroplast DNA codon use: evidence for selection at the psb A locus based on tRNA availability. J. Mol. Evol. 37:273–280

    Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426

    Nei, M., and L. Jin. 1989. Variances of the average numbers of nucleotide substitutions within and between populations. Mol. Biol. Evol. 6:290–300

    O'Grady, P. M. 1998. Phylogenetic relationships of flies in the family Drosophilidae inferred by combined analysis of molecular and morphological data sets. Doctoral dissertation, University of Arizona, Tucson

    O'Grady, P. M., J. B. Clark, and M. G. Kidwell. 1998. Phylogeny of the Drosophila saltans species group based on combined analysis of nuclear and mitochondrial DNA sequences. Mol. Biol. Evol. 15:656–664

    O'Hare, K., and G. M. Rubin. 1983. Structures of P transposable elements and their sites of insertion and excision in the Drosophila melanogaster genome. Cell 34:25–35

    Patterson, J. T., and G. B. Mainland. 1944. The Drosophilidae of Mexico. Univ. Tex. Publ. 4445:1–101

    Peixoto, A. A., S. Campesan, C. Rodolfo, and C. P. Kyriacou. 1993. Molecular evolution of a repetitive region within the per gene of Drosophila. Mol. Biol. Evol. 10:127–139

    Perkins, H. D., and A. J. Howells. 1992. Genomic sequences with homology to the P element of Drosophila melanogaster occur in the blowfly Lucilia cuprina. Proc. Natl. Acad. Sci. USA 89:10753–10757

    Quesneville, H., and D. Anxolabéhère. 1997. A simulation of P element horizontal transfer in Drosophila. Genetica 100:295–307

    Robertson, H. M., and D. J. Lampe. 1995. Recent horizontal transfer of a mariner transposable element among and between Diptera and Neuroptera. Mol. Biol. Evol. 12:850–862

    Robertson, H. M., F. N. Soto-Adames, K. O. K. Walden, R. M. P. Avancini, and D. J. Lampe. 1998. The mariner transposons of animals: horizontally jumping genes. Pp. 268–284 in M. Syvanen and C. I. Kado, eds. Horizontal gene transfer. Chapman and Hall, New York

    SanMiguel, P., A. Tikhonov, Y. K. Jin et al. (11 co-authors). 1996. Nested retrotransposons in the intergenic regions of the maize genome. Science 274:765–768

    Sharp, P. M., and W. H. Li. 1989. On the rate of DNA sequence evolution in Drosophila. J. Mol. Evol. 28:398–402

    Shields, D. C., P. M. Sharp, D. G. Higgins, and F. Wright. 1988. Silent sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol. 5:704–716

    Simonelig, M., and D. Anxolabéhère. 1991. A P element of Scaptomyza pallida is active in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 88:6102–6106

    Spassky, B. S., R. C. Richmond, S. Pérez-Salas, O. Pavlovsky, C. A. Mourão, A. S. Hunter, H. Hoenigsberg, T. Dobzhansky, and F. J. Ayala. 1971. Geography of sibling species related to Drosophila willistoni and semispecies of the Drosophila paulistorum complex. Evolution 25:129–143

    Stacey, S. N., R. A. Lansman, H. W. Brock, and T. A. Grigliatti. 1986. Distribution and conservation of mobile elements in the genus Drosophila. Mol. Biol. Evol. 3:522–534

    Starmer, W. T., and D. T. Sullivan. 1989. A shift in third-codon-position nucleotide frequency in alcohol dehydrogenase genes in the genus Drosophila. Mol. Biol. Evol. 6:544–552

    Sullivan, D. T., P. W. Atkinson, and W. T. Starmer. 1990. Molecular evolution of the alcohol dehydrogenase genes in the genus Drosophila. Evol. Biol. 24:107–147

    Swofford, D. 1998. PAUP*: phylogenetic analysis using parsimony. Version 4.064. Sinauer, Sunderland, Mass

    Throckmorton, L. H. 1975. The phylogeny, ecology and geography of Drosophila. Pp. 421–469 in R. C. King, ed. Handbook of genetics. Plenum Press, New York

    Waterman, M. S., and T. F. Smith. 1978. On the similarity of dendrograms. J. Theor. Biol. 73:789–800

    Witherspoon, D. J. 1999. Selective constraints on P-element evolution. Mol. Biol. Evol. 16:472–478

    Witherspoon, D. J., T. G. Doak, K. R. Williams, A. Seegmiller, J. Seger, and G. Herrick. 1997. Selection on the protein-coding genes of the TBE1 family of transposable elements in the ciliates Oxytricha fallax and O. trifallax. Mol. Biol. Evol. 14:696–706

    Wright, F. 1990. The ‘effective number of codons’ used in a gene. Gene 87:23–29

    Yang, Z. 1998. Phylogenetic analysis by maximum likelihood (PAML). Version 1.4. University College London, London, England

    Yang, Z., N. Goldman, and A. Friday. 1994. Comparison of models for nucleotide substitution used in maximum-likelihood phylogenetic estimation. Mol. Biol. Evol. 11:316–324

Accepted for publication June 14, 2000.