Section on Genomic Structure and Function, Laboratory of Molecular and Cellular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland
Abstract |
---|
Introduction |
---|
Whatever the case, the fact that L1 activity is deleterious enough to be subject to purifying selection suggests that control of L1 transposition would be important for maintaining host fitness. Aside from evidence suggesting that the general transcriptional inhibition imposed by DNA methylation (Jones 1999) could as well silence L1 transcription (Nur, Pascale, and Furano 1988; Thayer, Singer, and Fanning 1993; Hata and Sakaki 1997; Woodcock et al. 1997), neither the regulation of L1 replication nor any other possible role played by the host has been examined in detail. Host factors that specifically repress or reduce L1 activity would be highly advantageous. In turn, such factors would constitute a selective pressure on L1 to evade repression. Thus, L1 evolution may in part reflect interactions between the element and its host.
To identify regions of L1 that could be involved in host-L1 interactions, we examined the changes that occurred during the evolution of the active lineage of L1 elements from the ancestral L1PA5 family to the currently active L1PA1 family. In particular, we identified a region of the first open-reading frame (ORFI) that uniquely shows a high rate of nonsynonymous (amino acid replacement) substitutions, which is the typical signature of positive selection. The fact that this region of ORFI encodes a coiled coil domain that has been shown to mediate protein-protein interaction (Hohjoh and Singer 1996; Martin, Li, and Weisz 2000) suggests that the ORFI protein (ORFIp) could be involved in host-L1 interaction.
Materials and Methods |
---|
ORFIp was analyzed for coiled coil domains using the program COILS (Lupas, Van Dyke, and Stock 1991) at http://www.ch.embnet.org/software/COILS_form.html. COILS compares a given sequence to a database of sequences that are known to form coiled coil structures and calculates the probability that the sequence of interest will adopt a coiled coil conformation.
Test for Selection
The effect of selection on a coding sequence can be estimated by comparing the synonymous (dS) and nonsynonymous (dN) substitution rates (for a review, see Yang and Bielawski 2000). The value of the ratio ω = dN/dS is an indicator of the type and strength of selection. If nonsynonymous mutations have no effect on fitness, they will be fixed at the same rate as synonymous mutations and a value of ω = 1 is expected. If nonsynonymous mutations are deleterious, they will be fixed at a lower rate than synonymous mutations (i.e., negative or purifying selection) and ω will be <1. If nonsynonymous mutations are advantageous, they will be fixed faster than synonymous mutations (i.e., positive or adaptive selection) and ω will be >1. The parameter ω was estimated using the maximum likelihood (ML) method of Goldman and Yang (1994). In this method, parameters of a model of codon substitution are estimated from the data by ML and are used to calculate dN and dS. To test whether dN is significantly different from dS, ω was fixed at 1 in the null model (i.e., neutrality), whereas ω was estimated as a free parameter in the alternative model (Yang 1998). Twice the log-likelihood difference between the two models is compared with a χ² distribution with one degree of freedom to test whether ω is different from 1. All these calculations were performed using the codeml program of the PAML package (Yang 2000).
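The likelihood ratio test described above can be sketched in a few lines of Python. This is a minimal illustration, not the PAML implementation: the function names and the log-likelihood values in the usage note are hypothetical, and the two log-likelihoods are assumed to come from separate runs of a codon model with ω fixed at 1 and with ω free. For one degree of freedom, the χ² tail probability reduces to erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math
from typing import Tuple


def lrt_omega_equals_one(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood ratio test of omega = 1 (null) against omega free (alternative).

    lnl_null and lnl_alt are the maximized log-likelihoods of the two models
    (assumed inputs, e.g. read from the output of a codon-model program).
    Returns the test statistic 2 * delta lnL and its p-value.
    """
    # The alternative model nests the null, so the statistic cannot be negative
    # except through numerical noise; clamp it at zero.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def selection_regime(omega: float, p_value: float, alpha: float = 0.05) -> str:
    """Interpret omega = dN/dS given the outcome of the test above."""
    if p_value >= alpha:
        return "not distinguishable from neutrality"
    return "positive selection" if omega > 1 else "purifying selection"
```

For example, with hypothetical log-likelihoods of -1000.0 (ω fixed at 1) and -998.08 (ω free), the statistic is 3.84 and the p-value is close to 0.05, which is the familiar 5% threshold for one degree of freedom.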
Results |
---|
We distinguished between these possibilities by analyzing the nonsynonymous to synonymous rate ratio (ω, see Materials and Methods). The parameter ω was calculated independently for the coiled coil encoding region (from codon 51 to 148) and for the entire coding sequence (ORFI and ORFII), excluding the coiled coil domain (table 3). Because some of the pairwise comparisons in the coiled coil domain include very few if any synonymous substitutions, ML estimates of the parameter κ (= transition-transversion rate ratio for synonymous substitutions) were first obtained from the complete coding sequence. Values of κ for the entire L1 range from 1.8 to 3.1, and several values within this range were incorporated into the ML model to calculate ω. These analyses gave congruent results (only values with κ = 2.5 are shown in table 3). Pairwise comparisons among L1PA5, L1PA4, and L1PA3B give values for ω significantly higher than 1, indicating that nonsynonymous mutations have been fixed at a faster rate than synonymous mutations (i.e., faster than if they had been neutral). Thus, by these criteria, the coiled coil domain of ORFI has evolved under positive selection during the evolution of L1PA5 to L1PA3B. Although higher than 1, values of ω are lower when L1PA5 or L1PA4 are compared with L1PA2 or L1PA1 because of purifying selection acting during the evolution of L1PA3B to L1PA1 (see later). Purifying selection increases the relative number of synonymous substitutions between the older (L1PA5 or L1PA4) families and the younger (L1PA2 or L1PA1) families. As the number of nonsynonymous mutations between the older and younger families has hardly changed, the value of ω derived from these comparisons is lower.
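The dilution argument in the preceding paragraph can be made concrete with a small arithmetic sketch. The dN and dS values below are invented purely for illustration and are not taken from table 3; the helper name `omega` is likewise an assumption. The point is only that holding dN roughly constant while dS grows lowers the ratio.

```python
def omega(dn: float, ds: float) -> float:
    """Nonsynonymous to synonymous rate ratio; undefined when dS is zero."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


# Hypothetical divergences, chosen only to illustrate the reasoning.
dn_shared = 0.06        # replacements fixed during the older period (assumed)
ds_old_vs_old = 0.02    # silent divergence between two older families (assumed)
ds_old_vs_young = 0.04  # silent divergence after a later period of purifying
                        # selection added synonymous changes only (assumed)

omega_old = omega(dn_shared, ds_old_vs_old)
omega_young = omega(dn_shared, ds_old_vs_young)

# Same number of replacements, more silent changes: the ratio drops
# but can still exceed 1.
assert 1.0 < omega_young < omega_old
```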
By comparison, the amino acid sequence outside the coiled coil domain has always been highly conserved (table 3, above diagonal); in all comparisons ω is significantly lower than 1. This low rate of amino acid replacement indicates that strong purifying selection has been acting on most regions of the L1 proteins. In ORFII, sequence conservation is not limited to the endonuclease (EN) and reverse transcriptase (RT) encoding domains (fig. 1C). The segments that separate EN, RT, and the 3' terminal region of ORFII are also highly conserved, suggesting that these regions are functionally important because they either encode some yet to be described function or play a role in the conformation of the ORFII protein.
Pattern of Amino Acid Replacements in the Coiled Coil Domain
Coiled coil structures are formed by the intertwining of two or more α-helical peptide chains that have a repeating arrangement of nonpolar side chains (reviewed in Lupas 1996). Typically, domains that can form coiled coil structures consist of seven-residue repeats (heptads), with nonpolar or hydrophobic residues in the first (a) and fourth (d) positions of the heptad (fig. 2). The coiled coil domain of ORFIp ranges from amino acid 52 to 131 and consists of a first group of four or five heptads (depending on the family) separated from a group of six heptads by three amino acids (fig. 2). The COILS program indicates a 90% to 100% probability that these heptads will adopt a coiled coil conformation (Lupas, Van Dyke, and Stock 1991).
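The heptad pattern described above can be illustrated with a short Python sketch. This is not the COILS algorithm, which scores sequences against a database of known coiled coils; it only assigns a repeating a-to-g register and measures how often the a and d positions hold a hydrophobic residue. The residue set, the function names, and the toy peptide in the usage note are assumptions made for the example.

```python
# Residues treated as nonpolar/hydrophobic here; this set is an assumption,
# and published hydrophobicity classifications differ in detail.
HYDROPHOBIC = set("AILMFVWY")
HEPTAD = "abcdefg"


def heptad_register(length: int, start: str = "a") -> list:
    """Label each residue position with its heptad position (a to g)."""
    offset = HEPTAD.index(start)
    return [HEPTAD[(offset + i) % 7] for i in range(length)]


def core_hydrophobic_fraction(seq: str, start: str = "a") -> float:
    """Fraction of a and d positions occupied by hydrophobic residues."""
    register = heptad_register(len(seq), start)
    core = [res for res, pos in zip(seq.upper(), register) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq: str) -> str:
    """Starting heptad position that maximizes hydrophobic a/d occupancy."""
    return max(HEPTAD, key=lambda start: core_hydrophobic_fraction(seq, start))
```

As a usage example, the made-up peptide `LEELKKKLEELKKK` (two idealized heptads, not an L1 sequence) places leucine at every a and d position when the register starts at a, so `core_hydrophobic_fraction` returns 1.0 for that register and 0.0 when the register is shifted to start at b.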
Discussion |
---|
Because many aspects of L1 biology are still unknown, we can only speculate about the possible causes of positive selection. As the 5'UTR (untranslated region) of L1 evolves at a very high rate, including its wholesale replacement (Adey et al. 1994; Furano 2000), adaptive changes in ORFI could be a response to changes in the 5'UTR. Although the numbers of base changes between the 5'UTR of L1PA4 and L1PA3B and between L1PA2 and L1PA1 are roughly the same, ORFI has evolved under positive selection between L1PA4 and L1PA3B but remained highly conserved between L1PA2 and L1PA1. Thus, positive selection in ORFI is not correlated with the global rate of base substitution in the 5'UTR. Although changes in the amino acid sequence of ORFI may be a response to particular sequence changes in the 5'UTR, the selective pressure on ORFI may also lie elsewhere.
Most of the genes for which positive selection has been documented are involved in interactions between the organism and its environment (see Yang and Bielawski 2000). By analogy, we propose that positive selection in L1 may reflect an interaction between the L1 element (the organism) and the host (its environment). For instance, the rapid evolution of the coiled coil domain could have been driven by L1 adaptation to a host factor required by L1 for replication. Rapid evolution of the putative host factor might have occurred for a number of reasons, including avoidance of recruitment by L1. Alternatively, rapid evolution of the coiled coil domain could have resulted from the evasion by L1 of a host-encoded repressor of L1 replication. This would be similar to positive selection in pathogenic genes that evade a host's immune system (Zanotto et al. 1999; Haydon et al. 2001).
In both cases, the alternation between periods of positive and purifying selection on ORFI can be correlated with changes in L1 activity. In rodents (Pascale, Valle, and Furano 1990) and primates (unpublished data), L1 activity (amplification) is episodic, and therefore its deleterious effect on the host changes over time. Possibly, very active (deleterious) families would induce a strong response by the host, leading to intense positive selection for both the host and the element. Conversely, families that generate just enough copies to persist in the genome, but not enough to cause serious damage, would probably be ignored by the host, and the action of positive selection would be very limited.
Figure 2 shows that positive selection in ORFI resulted in substitutions among amino acids that share similar physicochemical properties. Therefore, the effects of positive selection on the coiled coil domain have been limited by structural constraints, i.e., the ability to form a coiled coil structure. This suggests that the potential to form a coiled coil structure is an important functional feature of ORFIp. This conclusion is supported by the fact that, although the N-terminal one-third of ORFI shows no sequence homology among murine rodents (old world rats and mice), rabbits, galagos, and humans (Kolosha and Martin 1997), all possess the potential to form coiled coil structures (data not shown; Martin, Li, and Weisz 2000). The ability of ORFIp to form a coiled coil structure is also shared by nonmammalian L1-like elements, such as the Xenopus Tx1L (cited in Pont-Kingdon et al. 1997), the teleost Swimmer (Duvernell and Turner 1998), and the bird CR1 elements (Haas et al. 1997; unpublished data). Coiled coils often mediate protein-protein interactions with themselves or other proteins. In mouse and human L1, the coiled coil domain mediates ORFIp binding to itself (Hohjoh and Singer 1996; Martin, Li, and Weisz 2000), but the possibility of interactions with other proteins has not been explored. The ORFIps of two divergent mouse L1 families (Tf and L1MdA) readily interact (Martin, Li, and Weisz 2000), suggesting that conservative changes in the coiled coil domain would not significantly affect ORFIp interaction with itself. Thus, interactions between ORFIp and other proteins could well be responsible for positive selection on ORFI.
Footnotes |
---|
Keywords: L1/LINE-1, human, retrotransposon, positive selection
Address for correspondence and reprints: Anthony V. Furano, NIH, Building 8, Room 203, 8 Center Dr., MSC 0830, Bethesda, Maryland 20892-0830. E-mail: avf{at}helix.nih.gov.
References |
---|
Adey N. B., S. A. Schichman, D. K. Graham, S. N. Peterson, M. H. Edgell, C. A. I. Hutchison, 1994 Rodent L1 evolution has been driven by a single dominant lineage that has repeatedly acquired new transcriptional regulatory sequences Mol. Biol. Evol 11:778-789
Boissinot S., P. Chevret, A. V. Furano, 2000 L1 (LINE-1) retrotransposon evolution and amplification in recent human history Mol. Biol. Evol 17:915-928
Boissinot S., A. Entezam, A. V. Furano, 2001 Selection against deleterious LINE-1-containing loci in the human lineage Mol. Biol. Evol 18:926-935
Cabot E. L., B. Angeletti, K. Usdin, A. V. Furano, 1997 Rapid evolution of a young L1 (LINE-1) clade in recently speciated Rattus taxa J. Mol. Evol 45:412-423
Duvernell D. D., B. J. Turner, 1998 Swimmer 1, a new low-copy-number LINE family in teleost genomes with sequence similarity to mammalian L1 Mol. Biol. Evol 15:1791-1793
Furano A. V., 2000 The biological properties and evolutionary dynamics of mammalian LINE-1 retrotransposons Prog. Nucleic Acid Res. Mol. Biol 64:255-294
Goldman N., Z. Yang, 1994 A codon-based model of nucleotide substitution for protein-coding DNA sequences Mol. Biol. Evol 11:725-736
Grassly N. C., E. C. Holmes, 1997 A likelihood method for the detection of selection and recombination using sequence data Mol. Biol. Evol 14:239-247
Haas N. B., J. M. Grabowski, A. B. Sivitz, J. B. Burch, 1997 Chicken repeat 1 (CR1) elements, which define an ancient family of vertebrate non-LTR retrotransposons, contain two closely spaced open reading frames Gene 197:305-309
Hardies S. C., S. L. Martin, C. F. Voliva, C. A. Hutchison III, M. H. Edgell, 1986 An analysis of replacement and synonymous changes in the rodent L1 repeat family Mol. Biol. Evol 3:109-125
Hata K., Y. Sakaki, 1997 Identification of critical CpG sites for repression of L1 transcription by DNA methylation Gene 189:227-234
Haydon D. T., A. D. Bastos, N. J. Knowles, A. R. Samuel, 2001 Evidence for positive selection in foot-and-mouth disease virus capsid genes from field isolates Genetics 157:7-15
Hohjoh H., M. Singer, 1997 Sequence specific single-strand RNA-binding protein encoded by the human LINE-1 retrotransposon EMBO J 16:6034-6043
Hohjoh H., M. F. Singer, 1996 Cytoplasmic ribonucleoprotein complexes containing human LINE-1 protein and RNA EMBO J 15:630-639
Howell R., K. P. Usdin, 1997 The ability to form intrastrand tetraplexes is an evolutionarily conserved feature of the 3' end of L1 retrotransposons Mol. Biol. Evol 14:144-155
Jones P. A., 1999 The DNA methylation paradox Trends Genet 15:34-37
Kazazian H. H. Jr., J. V. Moran, 1998 The impact of L1 retrotransposons on the human genome Nat. Genet 19:19-24
Kolosha V. O., S. L. Martin, 1995 Polymorphic sequences encoding the first open reading frame protein from LINE-1 ribonucleoprotein particles J. Biol. Chem 270:2868-2873
Kolosha V. O., S. L. Martin, 1997 In vitro properties of the first ORF protein from mouse LINE-1 support its role in ribonucleoprotein particle formation during retrotransposition Proc. Natl. Acad. Sci. USA 94:10155-10160
Lander E. S., L. M. Linton, B. Birren, et al. (100 co-authors) 2001 Initial sequencing and analysis of the human genome Nature 409:860-921
Li W.-H., 1997 Molecular evolution Sinauer Associates, Sunderland, Mass.
Lupas A., 1996 Coiled coils: new structures and new functions Trends Biochem. Sci 21:375-382
Lupas A., M. Van Dyke, J. Stock, 1991 Predicting coiled coils from protein sequences Science 252:1162-1164
Malik H. S., W. D. Burke, T. H. Eickbush, 1999 The age and evolution of non-LTR retrotransposable elements Mol. Biol. Evol 16:793-805
Martin S. L., F. D. Bushman, 2001 Nucleic acid chaperone activity of the ORF1 protein from the mouse LINE-1 retrotransposon Mol. Cell. Biol 21:467-475
Martin S. L., J. Li, J. A. Weisz, 2000 Deletion analysis defines distinct functional domains for protein-protein and nucleic acid interactions in the ORF1 protein of mouse LINE-1 J. Mol. Biol 304:11-20
Mayorov V. I., I. B. Rogozin, L. R. Adkison, 1999 Characterization of several LINE-1 elements in Microtus kirgisorum Mamm. Genome 10:724-729
Nur I., E. Pascale, A. V. Furano, 1988 The left end of rat L1 (L1Rn, long interspersed repeated) DNA which is a CpG island can function as a promoter Nucleic Acids Res 16:9233-9251
Pascale E., C. Liu, E. Valle, K. Usdin, A. V. Furano, 1993 The evolution of long interspersed repeated DNA (L1, LINE 1) as revealed by the analysis of an ancient rodent L1 DNA family J. Mol. Evol 36:9-20
Pascale E., E. Valle, A. V. Furano, 1990 Amplification of an ancestral mammalian L1 family of long interspersed repeated DNA occurred just before the murine radiation Proc. Natl. Acad. Sci. USA 87:9481-9485
Pont-Kingdon G., E. Chi, S. Christensen, D. Carroll, 1997 Ribonucleoprotein formation by the ORF1 protein of the non-LTR retrotransposon Tx1L in Xenopus oocytes Nucleic Acids Res 25:3088-3094
Smit A. F. A., 1999 Interspersed repeats and other mementos of transposable elements in mammalian genomes Curr. Opin. Genet. Dev 9:657-663
Smit A. F. A., G. Tóth, A. D. Riggs, J. Jurka, 1995 Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences J. Mol. Biol 246:401-417
Thayer R. E., M. F. Singer, T. G. Fanning, 1993 Undermethylation of specific LINE-1 sequences in human cells producing a LINE-1-encoded protein Gene 133:273-277
Voliva C. F., C. L. Jahn, M. B. Comer, C. A. Hutchison III, M. H. Edgell, 1983 The L1Md long interspersed repeat family in the mouse: almost all examples are truncated at one end Nucleic Acids Res 11:8847-8859
Woodcock D. M., C. B. Lawler, M. E. Linsenmeyer, J. P. Doherty, W. D. Warren, 1997 Asymmetric methylation in the hypermethylated CpG promoter region of the human L1 retrotransposon J. Biol. Chem 272:7810-7816
Yang Z., 1998 Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution Mol. Biol. Evol 15:568-573
Yang Z., 2000 PAML (phylogenetic analysis by maximum-likelihood) Version 3.0 University College, London
Yang Z., J. P. Bielawski, 2000 Statistical methods for detecting molecular adaptation Trends Ecol. Evol 15:496-503
Zanotto A. M. d. A., E. G. Kallas, R. F. de Souza, E. C. Holmes, 1999 Genealogical evidence for positive selection in the nef gene of HIV-1 Genetics 153:1077-1089