Departamento de Genética and Instituto Cavanilles de Biodiversidad y Biología Evolutiva, Universidad de Valencia, Spain
Ty3/Gypsy long-terminal-repeat (LTR) retrotransposons are among the best-known transposable elements. They inhabit the genomes of many eukaryotic organisms, such as slime molds, plants, fungi, and animals, including vertebrates (Xiong and Eickbush 1990; Malik and Eickbush 1999; Miller et al. 1999; Marín and Lloréns 2000). However, in spite of extensive genomic information, these elements had never been found in mammals. In the process of building a database of integrase domain (IN) sequences, we found an intriguing human sequence very similar to the IN of Ty3/Gypsy elements. It was particularly similar to the IN of the Drosophila melanogaster 412 element (E value = 10^-27). The sequence of the human gene, which we called Gypsy integrase-1, or Gin-1, was reconstructed by combining information from genomic and cDNA sequences present in the National Center for Biotechnology Information databases (online at http://www.ncbi.nlm.nih.gov/; sequences in TIGR and Sanger Center databases did not provide additional information). Partial mouse, rat, and cow orthologous cDNAs were also detected. Moreover, an apparently full-length mouse cDNA sequence (accession number AK015243) was also found. However, the corresponding genomic sequences are not yet available for any of these other mammalian species.
Figure 1 summarizes the structure of the human Gin-1 gene and describes the protein it encodes. The protein has the characteristic H2C2, DDE, and GPY/F motifs found in many retroviral and retrotransposon integrases (Khan et al. 1991; Malik and Eickbush 1999). Homology to the IN of Ty3/Gypsy elements spans the whole protein sequence (fig. 2), strongly suggesting that GIN-1 is also, and exclusively, an integrase. The similarity between the human and mouse genes is that expected for orthologous genes of these two species. Amino acid identity is 446/522 = 85%, while a comparative analysis of 1,138 mouse/human orthologs estimated an average identity in their coding regions of 86.4% (Makalowski and Boguski 1998).
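The identity figure quoted above is a ratio of identical residues to compared positions. As a minimal sketch of that arithmetic, the following Python function computes percent identity from two pre-aligned sequences. The function name, the toy sequences, and the choice to skip gapped columns are assumptions made for illustration; the text does not state how gaps were treated, and the fragments shown are not real Gin-1 sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gapped columns ('-') are excluded from the denominator.
    Other conventions (e.g. dividing by the full protein length) give
    slightly different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments, used only to exercise the function.
toy_human = "MKRLH-CDEGPYLV"
toy_mouse = "MKRIHACDEGPF-V"
toy_identity = percent_identity(toy_human, toy_mouse)
print(f"{toy_identity:.1f}% identity over ungapped columns")
```

A value obtained this way can then be compared with a reference figure, such as the 86.4% average reported for mouse/human orthologs, to judge whether a pair of sequences diverges as expected for orthologs.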
Gin-1 is widely expressed. The developmental stages or tissues from which cDNA libraries were obtained that included at least one human Gin-1 clone were as follows: 12-week embryo (accession number AA328555), kidney (AI631948), brain (H23481), skeletal muscle (BF789980), and placenta (BE929727). Gin-1 is also expressed in different human tumors, originating from the parathyroid gland (W39050), the colon (AA574153), the stomach (AI933750), the bladder (BE566969), the uterus (AW572091), and the prostate (AA804922). Mouse Gin-1 cDNAs were found in libraries from embryos (AA162090), the testis (AK015243), the retina (BF463804), the lymph node (AA190067), the thymus (BF658153), the mammary gland (AA542272), and the spleen (AI639767). Finally, a rat Gin-1 sequence came from an ovary cDNA library (AW144198). The only cow sequence available so far (BF774149) comes from a library of mixed origin.
Some other mammalian genes seem to have derived from transposable elements or retroviruses (reviewed in Smit 1999; International Human Genome Sequencing Consortium 2001). This is, however, the first description of a host gene derived from an LTR retrotransposon protein domain. In a recent work, Volff, Körting, and Schartl (2001) proposed the existence of another human gene (KIAA1051) derived from a Ty3/Gypsy element. Human KIAA1051 has similarity to the gag protein, as well as to the protease domain and a fragment of the reverse transcriptase domain of the pol protein of certain Ty3/Gypsy elements that we have called "chromoviruses" (see Marín and Lloréns 2000). It is well known that chromoviruses exist in other vertebrate genomes (Malik and Eickbush 1999; Marín and Lloréns 2000), and therefore the finding of chromovirus-related sequences in mammals is not totally unexpected. In fact, KIAA1051 lacks introns, and its structure resembles that of a truncated retrotransposon, including overlap of the putative gag and pol genes and an apparent lack of a start codon for the pol open reading frame (ORF). One of the main results suggesting that KIAA1051 may be an active mammalian gene is the finding of very similar, putatively orthologous ORFs in several other mammalian species, of which the best characterized are mouse sequences (see Volff, Körting, and Schartl 2001). However, the similarity between the human and mouse sequences is lower than that found for Gin-1 (identity is about 70% in the gag region and about 75% for the partial pol sequence). Moreover, Volff, Körting, and Schartl (2001) noted that the structures deduced for the putative orthologs in human and mouse are not totally congruent. It should also be considered that, although Volff, Körting, and Schartl (2001) did not find sequences related to KIAA1051 in the human genome, we detected a KIAA1051-related sequence in a recently released human clone (accession number AL117190.5). Interestingly, the KIAA1051-like sequence in the AL117190.5 clone contains a 1,358-amino-acid-long ORF that has highly significant similarity to the pol gene of Sushi, a chromovirus of the fish Fugu rubripes (E value = 7 × 10^-59). This ORF is transcribed (some related cDNAs are found in the databases), but the functional meaning of the product it encodes is unclear. For example, it lacks some highly conserved regions found in Ty3/Gypsy elements, such as the characteristic YXDD signature in the reverse transcriptase domain. Thus, although the evidence obtained by Volff, Körting, and Schartl (2001) is suggestive, it is still an open question whether KIAA1051 and the related sequence in AL117190.5 are bona fide human genes or simply peculiar types of defective, "pseudogenized" chromoviruses with nonfunctional ORFs that have retained a background level of transcription.
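The YXDD signature mentioned above is a four-residue pattern: tyrosine, any residue, then two aspartates. A quick screen for it can be sketched in Python with a regular expression. The function name and the toy sequences below are hypothetical and are not taken from KIAA1051, AL117190.5, or Sushi. A plain pattern match of this kind is only a first-pass check: a four-residue pattern can occur by chance anywhere in a long ORF, so a hit or a miss would normally be confirmed against an alignment of the reverse transcriptase domain.

```python
import re

# Y, any single residue, then D D (the "YXDD" signature).
YXDD = re.compile(r"Y[A-Z]DD")


def find_yxdd(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each YXDD match."""
    return [(m.start() + 1, m.group()) for m in YXDD.finditer(protein.upper())]


# Hypothetical fragments used only to exercise the function.
toy_with_motif = "MSTLPQGYVDDILIASK"
toy_without_motif = "MSTLPQGYVNEILIASK"

print(find_yxdd(toy_with_motif))
print(find_yxdd(toy_without_motif))
```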
We can only speculate about the functions of Gin-1. Its conservation in different mammalian orders and its transcription in many different tissues argue for its having a significant, perhaps even essential, function. An intriguing possibility is that it is involved in repressing retrotransposon and/or retrovirus activity. It is known that an endogenous retrovirus-derived gene, Fv1, is able to confer resistance to murine leukemia viruses in the mouse (reviewed in Stoye 1998). Gin-1 may be part of an analogous defense system, perhaps having contributed to the virtual absence of Ty3/Gypsy elements in mammalian genomes.
Thomas Eickbush, Reviewing Editor
1 Keywords: Gypsy, gene emergence, genome protection
2 Address for correspondence and reprints: Ignacio Marín, Departamento de Genética, Universidad de Valencia, Calle Doctor Moliner, 50, Burjassot 46100, Valencia, Spain. E-mail: ignacio.marin@uv.es.
References
Berg D. E., M. M. Howe, eds. 1989 Mobile DNA American Society for Microbiology, Washington, D.C.
International Human Genome Sequencing Consortium. 2001 Initial sequencing and analysis of the human genome Nature 409:860-921
Khan E., J. P. Mack, R. A. Katz, J. Kulkosky, A. M. Skalka, 1991 Retroviral integrase domains: DNA binding and the recognition of LTR sequences Nucleic Acids Res 19:851-860
Makalowski W., M. S. Boguski, 1998 Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences Proc. Natl. Acad. Sci. USA 95:9407-9412
Malik H. S., T. H. Eickbush, 1999 Modular evolution of the integrase domain in the Ty3/Gypsy class of LTR retrotransposons J. Virol 73:5186-5190
Marín I., C. Lloréns, 2000 Ty3/Gypsy retrotransposons: description of new Arabidopsis thaliana elements and evolutionary perspectives derived from comparative genomic data Mol. Biol. Evol 17:1040-1049
Marín I., P. Plata-Rengifo, M. Labrador, A. Fontdevila, 1998 Evolutionary relationships among the members of an ancient class of non-LTR retrotransposons found in the nematode Caenorhabditis elegans Mol. Biol. Evol 15:1390-1402
Miller K., C. Lynch, J. Martin, E. Herniou, M. Tristem, 1999 Identification of multiple Gypsy LTR-retrotransposon lineages in vertebrate genomes J. Mol. Evol 49:358-366
Smit A. F. A., 1999 Interspersed repeats and other mementos of transposable elements in mammalian genomes Curr. Opin. Genet. Dev 9:657-663
Stoye J. P., 1998 Fv1, the mouse retrovirus resistance gene Rev. Sci. Tech 17:269-277
Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882
Volff J. N., C. Körting, M. Schartl, 2001 Ty3/Gypsy retrotransposon fossils in mammalian genomes: did they evolve into new cellular functions? Mol. Biol. Evol 18:266-270
Wells D. J., 1999 Tdd-4, a DNA transposon of Dictyostelium that encodes proteins similar to LTR retroelement integrases Nucleic Acids Res 27:2408-2415
Xiong Y., T. H. Eickbush, 1990 Origin and evolution of retroelements based upon their reverse transcriptase sequences EMBO J 9:3353-3362