Institute of Medical Biology, University of Vienna, Vienna, Austria
P elements are DNA transposons that were first identified as the causative agent of hybrid dysgenesis in Drosophila melanogaster (Kidwell, Kidwell, and Sved 1977) but were later found to occur in many drosophilid species. The interspecific distribution of P-element sequences, as well as their sequence relationships, is not in accordance with the phylogeny of their host species. Therefore, P elements are not merely vertically inherited but can also be transmitted horizontally between sexually isolated taxa. The most striking example is the rather recent transfer from Drosophila willistoni to D. melanogaster (Daniels et al. 1990), which must have occurred in the last century and was followed by a rapid spread of P elements through the natural populations of this new host (Anxolabéhère, Kidwell, and Periquet 1988). Additional cases of horizontal transmission show that P elements have repeatedly crossed species barriers and have even invaded species of the related genera Scaptomyza and Lordiphosa (Hagemann, Haring, and Pinsker 1996a; Clark and Kidwell 1997; Haring, Hagemann, and Pinsker 2000; Silva and Kidwell 2000). However, because their mode of transposition requires host-encoded proteins (Rio and Rubin 1988), the distribution of active P elements seemed to be restricted to fruit flies of the drosophilid family. Thanks to the data provided by the human genome project, we are able to show that a homolog of the P-element coding sequence exists as a stationary single-copy sequence in the human genome.
Actively transposing Drosophila P elements consist of four exons (designated 0–3) flanked by terminal inverted repeats (O'Hare and Rubin 1983). The exons code for two different proteins produced by differential splicing of the primary transcript (Misra and Rio 1990). The transposase contains the information of all four exons and mediates transposition in germ line cells. A second protein, which is translated from an mRNA that retains the third intron, acts as a repressor of P-element transposition in somatic cells. Besides these full-sized P elements, terminally truncated P-element homologs have been detected in two different Drosophila lineages (Miller et al. 1992; Nouaud and Anxolabéhère 1997). In both cases, the terminal inverted repeats are missing and the coding region lacks the transposase-specific exon 3. It is assumed that these truncated P homologs are derivatives of previously active transposons that were initially maintained in their host genomes as suppressors of transposition but later acquired new functions there (Miller et al. 1999). P-related sequences without terminal inverted repeats were also found outside the drosophilid family in the Australian sheep blowfly Lucilia cuprina (Perkins and Howells 1992) and in the house fly Musca domestica (Lee, Clark, and Kidwell 1999). These immobile P homologs may represent either truncated derivatives of once transpositionally active P elements or, alternatively, descendants of an ancestral genomic progenitor sequence that later evolved into mobile P elements by acquisition of the terminal structures essential for transposition.
The P homolog in the human genome (Phsa) was discovered by a BLAST search of the GenBank database carried out with a 1,690-bp cDNA sequence derived from a recently isolated 3'-truncated P element of Drosophila subsilvestris (unpublished data), which belongs to the T-type subfamily of P transposons (Hagemann, Haring, and Pinsker 1996b). The search revealed significant amino acid similarity of the deduced Drosophila protein sequence to a human protein of unknown function (accession number BAB15609), with 23% identities and 40% similarities over 407 amino acids (BLAST E value: 9 × 10⁻⁷). The corresponding cDNA sequence of this human protein was originally described in the course of the NEDO human cDNA sequencing project (accession number AK026973). Partial sequences are registered in the human EST database (AA443424, AI219532, AA659374, AA194210, AA194021). Through the cDNA sequence, we were able to identify the entire gene in the recently released complete human genome sequence (position NT_006413.2/Hs 4_6570 on the long arm of chromosome 4). The exon/intron boundaries were subsequently deduced from the alignment of the genomic sequence with the cDNA. Alignments with several insect P-element sequences suggest that the coding region of Phsa extends farther upstream than the previously presumed start codon.
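To make the reported figures concrete, the following minimal sketch shows how percent identity and percent similarity can be computed from a pairwise protein alignment. It is an illustration only, not the BLAST procedure used here: BLAST scores similar residues ("positives") with a substitution matrix, whereas the `SIMILAR_GROUPS` table below is a simplified, hypothetical stand-in, and the function name `alignment_stats` is likewise an assumption. For orientation, 23% identities over 407 aligned positions corresponds to roughly 94 identical residues.

```python
# Hypothetical residue groups treated as "similar" for this sketch.
# BLAST itself uses a substitution matrix (e.g. BLOSUM62) instead.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]


def alignment_stats(seq_a: str, seq_b: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must have the same length; '-' marks a gap. Percentages are
    taken over the full alignment length, gap columns included.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("alignment is empty")

    identical = 0
    similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: counts toward length, never a match
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1

    length = len(seq_a)
    return 100.0 * identical / length, 100.0 * similar / length


# Toy example: the conserved motif ATQLFS against an invented variant.
identity, similarity = alignment_stats("ATQLFS", "ATQIFT")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

The second sequence in the toy example is invented for demonstration and does not represent any sequence discussed in this paper.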
In figure 1a, the molecular structure of Phsa is compared with that of the full-sized canonical P element (pπ25.1) of D. melanogaster. The coding region of Phsa includes equivalents of exons 1–3 but lacks nearly the whole of exon 0 and the terminal structures. No inverted repeats are found in the flanking regions over a distance of 1 kb on either side. The presumed start codon is located close to the 3' end of exon 0 (ed0) of the Drosophila sequence. Phsa contains three exons (eh1–eh3) and two introns (ih1 and ih2). The introns of the Drosophila P element (id1–id3) are missing in Phsa. The large exon eh3 is homologous to the 3' section of ed1 and the complete sequences of ed2 and ed3. The two introns ih1 and ih2, which are not found in the Drosophila sequence, are located within the section corresponding to ed1. The larger intron, ih2, has a length of 9,008 bp and contains insertions of six mobile sequences (five Alu elements and one LINE-1 element). The reading frames are intact and encode a protein of 759 amino acids. The occurrence as a single-copy sequence (at least in the euchromatic part of the human genome), the absence of the characteristic inverted repeat termini, and the length of the sequence (12.4 kb) suggest that Phsa is not transpositionally active. In figure 1b, the deduced protein sequences of Phsa and the D. melanogaster P element (pπ25.1) are compared. Sequence similarity is highest in the central section of ed2. One of the conserved motifs (ATQLFS) is found not only in Phsa and pπ25.1 but also in all Drosophila P-related sequences and, with one replacement, in the P homologs of L. cuprina and M. domestica. The section corresponding to ed3 shows the strongest divergence (amino acid similarity: 22.1% in ed1, 27.7% in ed2, and 13.6% in ed3) and differs in length by 56 amino acids. This is not surprising, as exon 3, which is required for transposase function only, is the most variable section among the Drosophila P elements (Witherspoon 1999). Nevertheless, the presence of two conserved motifs, AGYV++KL and GL++PSE, suggests that this section too is homologous to P elements.
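The absence of terminal inverted repeats is one of the criteria used above to infer immobility. As a hedged illustration of what such a check involves, the sketch below tests whether the start of a DNA sequence matches the reverse complement of its end. The function names, the mismatch parameter, and the toy sequences are assumptions made for this example; the text does not specify how the flanking regions were actually searched.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(seq: str, max_mismatches: int = 0) -> int:
    """Length of the terminal inverted repeat at the ends of `seq`.

    The 5' end is compared base by base with the reverse complement of the
    3' end. Scanning stops once more than `max_mismatches` mismatches have
    been seen; the returned length always ends on a matching base.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)  # rc[:k] is the reverse complement of seq[-k:]
    best = 0
    mismatches = 0
    for k in range(len(seq) // 2):  # the two arms must not overlap
        if seq[k] == rc[k]:
            best = k + 1
        else:
            mismatches += 1
            if mismatches > max_mismatches:
                break
    return best


# Toy sequence: an 8-bp arm, a spacer, and the arm's reverse complement.
arm = "CATGATGA"
mobile_like = arm + "TTTTTTTTTT" + reverse_complement(arm)
print(terminal_inverted_repeat_length(mobile_like))
```

A sequence without matching termini returns 0 under this check. A real search would also have to scan for candidate repeat positions within the flanking regions rather than only at the exact ends of a predefined sequence.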
Acknowledgements
This work was supported by the Austrian Science Foundation (FWF, project P11819-GEN).
Footnotes
1 Keywords: P element, Drosophila, human genome, sequence phylogeny
2 Address for correspondence and reprints: Sylvia Hagemann, Institute of Medical Biology, University of Vienna, Währingerstrasse 10, A-1090 Vienna, Austria. E-mail: sylvia.hagemann@univie.ac.at.
References
Anxolabéhère D., M. G. Kidwell, G. Periquet, 1988 Molecular characteristics of diverse populations are consistent with the hypothesis of a recent invasion of Drosophila melanogaster by mobile P elements Mol. Biol. Evol. 5:252-269
Clark J. B., M. G. Kidwell, 1997 A phylogenetic perspective on P transposable element evolution in Drosophila Proc. Natl. Acad. Sci. USA 94:11428-11433
Daniels S. B., K. R. Peterson, L. D. Strausbaugh, M. G. Kidwell, A. Chovnick, 1990 Evidence for horizontal transmission of the P transposable element between Drosophila species Genetics 124:339-355
Hagemann S., E. Haring, W. Pinsker, 1996a Repeated horizontal transfer of P transposons between Scaptomyza pallida and Drosophila bifasciata Genetica 98:43-51
Hagemann S., E. Haring, W. Pinsker, 1996b A new P element subfamily from Drosophila tristis, D. ambigua, and D. obscura Genome 39:978-985
Haring E., S. Hagemann, W. Pinsker, 2000 Ancient and recent horizontal invasions of drosophilids by P elements J. Mol. Evol. 51:577-586
International Human Genome Sequencing Consortium 2001 Initial sequencing and analysis of the human genome Nature 409:860-921
Kidwell M. G., J. F. Kidwell, J. A. Sved, 1977 Hybrid dysgenesis in Drosophila melanogaster: a syndrome of aberrant traits including mutation, sterility and male recombination Genetics 86:813-833
Kidwell M. G., D. R. Lisch, 2001 Perspective: transposable elements, parasitic DNA, and genome evolution Evolution 55:1-24
Lee S. H., J. B. Clark, M. G. Kidwell, 1999 A P element-homologous sequence in the house fly, Musca domestica Insect Mol. Biol. 8:491-500
Miller W. J., S. Hagemann, E. Reiter, W. Pinsker, 1992 P homologous sequences are tandemly repeated in the genome of Drosophila guanche Proc. Natl. Acad. Sci. USA 89:4018-4022
Miller W. J., J. F. McDonald, D. Nouaud, D. Anxolabéhère, 1999 Molecular domestication – more than a sporadic episode in evolution? Genetica 107:197-207
Misra S., D. C. Rio, 1990 Cytotype control of P element transposition: the 66 kd protein is a repressor of transposase activity Cell 62:269-284
Nouaud D., D. Anxolabéhère, 1997 P element domestication: a stationary truncated P element may encode a 66-kDa repressor-like protein in the Drosophila montium species subgroup Mol. Biol. Evol. 14:1132-1144
O'Hare K., G. M. Rubin, 1983 Structures of P transposable elements and their sites of insertion and excision in the Drosophila melanogaster genome Cell 34:25-35
Perkins H. D., A. J. Howells, 1992 Genomic sequences with homology to the P element of Drosophila melanogaster occur in the blowfly Lucilia cuprina Proc. Natl. Acad. Sci. USA 89:10753-10757
Rio D. C., G. M. Rubin, 1988 Identification and purification of a Drosophila protein that binds to the terminal 31-base-pair inverted repeats of the P transposable element Proc. Natl. Acad. Sci. USA 85:8929-8933
Silva J. C., M. G. Kidwell, 2000 Horizontal transfer and selection in the evolution of P elements Mol. Biol. Evol. 17:1542-1557
Swofford D., 1997 PAUP: phylogenetic analysis using parsimony. Version 4.0.0d Smithsonian Institution, Washington, D.C.
Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res. 25:4876-4882
Witherspoon D. J., 1999 Selective constraints on P-element evolution Mol. Biol. Evol. 16:472-478