Mosaic Structure and Retropositional Dynamics During Evolution of Subfamilies of Short Interspersed Elements in African Cichlids

Kazuhiko Takahashi1 and Norihiro Okada

Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Yokohama


    Abstract
The African cichlid (AFC) family of short interspersed elements (SINEs) is found in the genomes of cichlid fish. The alignment of the sequences of 70 members of this family, isolated from such fish in Africa, revealed the presence of correlated changes in specific nucleotides (diagnostic nucleotides) that allowed us to categorize the various members into six subfamilies, which were designated Af1 through Af6. When the SINE consensus sequence was divided into a 5'-head region and a 3'-tail region, these subfamilies were defined by various combinations of four types of head region (A–D) and three types of tail region [X, Y, and (YX)], with each region of each type including unique diagnostic nucleotides. The observed structures of the subfamilies Af1 through Af6 were AX, AY, CY, A(YX), BY, and DX, respectively. The formation of such structures might have involved the shuffling of head or tail regions among preexisting or existing subfamilies (or both) of the AFC family (and, probably, even another SINE family or a pseudogene for a tRNA in the case of the Af6 subfamily) by recombination at the so-called core region during the course of evolution. By plotting the timing of the retroposition of individual members of each subfamily on a phylogenetic tree of AFCs, we found that the Af3 and Af6 subfamilies became active only recently in the evolutionary history of these fish. The integrity of the 3'-tails of SINEs, which are apparently recognized by reverse transcriptase, has been reported to be indispensable for retention of retropositional activity. Therefore, we postulate that recombination might have been involved in the apparent recent activation of the retroposition of the Af3 and Af6 subfamilies via introduction of active tails (types Y and X, respectively) into potential ancestral sequences that might have had inactive tails. If this hypothesis is correct, shuffling of tail regions among subfamilies by recombination at the core region might have played a role in the recycling of dead copies of AFC SINEs.


    Introduction
Short interspersed elements (SINEs) are retroposons that are dispersed in the genomes of numerous multicellular organisms (Okada 1991a, 1991b; Schmid and Maraia 1992; Okada and Ohshima 1995). They are a few hundred nucleotides in length and none of them encodes a protein. They are known to increase their numbers within the genome via retroposition, namely, the reverse flow of genetic information from RNA back into DNA (Weiner, Deininger, and Efstratiadis 1986). Each SINE has a tRNA-related region within its 5'-region and all SINEs are considered to be derived from tRNAs (Okada et al. 1985; Sakamoto and Okada 1985; Okada 1991a, 1991b; Okada and Ohshima 1995), with the exception, to date, of the Alu family in primates and the B1 family in rodents, which originated from 7SL RNA (Weiner 1980; Ullu and Tschudi 1984).

Another structural feature of tRNA-related SINEs is their relationship to long interspersed elements (LINEs), which are also retroposons but which encode the reverse transcriptase that is necessary for their own retroposition. Many examples of sequence homology between the 3'-ends of a SINE family and a LINE family have been reported (Ohshima et al. 1996; Okada and Hamada 1997; Terai, Takahashi, and Okada 1998; Ogiwara et al. 1999). Thus, it has been proposed that the reverse transcriptase encoded by an active LINE is responsible for the retroposition of a partner SINE that coexists in the same genome and, moreover, that the 3'-end of the SINE is recognized by the LINE-encoded reverse transcriptase during this process (Okada et al. 1997).

The evolution of sequences within a single family of SINEs can be considered in terms of the formation of subfamilies, each of which uniquely shares correlated changes in nucleotides, in other words, each of which includes certain diagnostic nucleotides. Subfamilies of SINEs appear to have been formed by the accumulation of mutations in a limited number of SINE sequences that were capable of retroposition, and these SINEs are called source genes (Schmid and Maraia 1992). Subfamily structures have been described for the Alu family in primates (for review, see Jurka and Milosavljevic [1991]; Shen, Batzer, and Deininger [1991]; and Deininger and Batzer [1995]), the B1 family in rodents (Zietkiewicz and Labuda 1996), the S1 family in cruciferous plants (Lenoir et al. 1997), and the HpaI family in salmon (Kido et al. 1994, 1995; Takasaki et al. 1994, 1996), among others, from alignments of sequences in each family. However, many other reports on SINE families have not focused on such structures, and details of the mechanisms of evolution of such subfamilies remain to be clarified.

The African cichlid (AFC) family (Takahashi et al. 1998) of SINEs was first characterized in cichlid fish (Family Cichlidae) in the East African Great Lakes. These fish are famous for the large number of species, the endemicity of species in each lake, and the diversity that has been acquired by explosive adaptive radiation (Fryer and Iles 1972; Greenwood 1984; Coulter 1991). Recent reports (Takahashi et al. 1998, 2001a, 2001b; Y. Terai et al., unpublished data) have discussed the timing of retroposition of individual members of the AFC SINE family, which includes several dozen orthologous loci, in attempts to elucidate the phylogenetic relationships among cichlids (see Shedlock, Milinkovitch, and Okada [2000] and Shedlock and Okada [2000] for the methodology, which involves polymerase chain reactions [PCRs]). However, the cited analyses focused only on the presence or absence of SINE sequences themselves at genomic loci. The diagnostic nucleotides were not analyzed, although the existence of subfamilies was suggested by a preliminary analysis of a limited number of sequences (Terai, Takahashi, and Okada 1998). In the present study, we reexamined many sequences in the AFC SINE family and found three divergent subfamilies that had not previously been reported, in addition to the three previously reported subfamilies. We then attempted to clarify the evolution and retropositional dynamics of these SINEs by comparing their sequences and analyzing the timing of their insertion at genomic loci by retroposition.


    Materials and Methods
The sequences of 70 members of the AFC family that were described in previous reports (Takahashi et al. 1998, 2001a, 2001b; Y. Terai et al., unpublished data) were initially aligned with the CLUSTAL X program (Higgins, Bleasby, and Fuchs 1992). The alignment was inspected by eye, and gaps were inserted to improve the alignment. The source species, references, and accession numbers of these sequences are given in table 1. Subfamilies were identified by comparison of diagnostic nucleotides that had been revealed by the alignment. The consensus sequence of each subfamily was determined basically from the most frequent base at each site in the constituent sequences. We did not attempt to define a general consensus sequence from the sequences of all the analyzed members of the AFC family because, as discussed later, part of the sequence of the Af6 subfamily might have been derived from a different family of SINEs or from the pseudogene for a tRNA. We searched the DDBJ-EMBL-GenBank international nucleotide sequence database for the tRNA-related regions (Takahashi et al. 1998) of the consensus sequences of the subfamilies Af3 and Af6 using the BLAST program (Altschul et al. 1990).
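The majority rule used for the subfamily consensus sequences can be illustrated with a minimal Python sketch. It assumes that the input sequences are already aligned to equal length and that a gap character is treated like any other symbol; the function name, the tie-breaking behavior, and the toy sequences are illustrative assumptions and are not taken from the data in table 1.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most frequent symbol in each column of an alignment."""
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*seqs) iterates over alignment columns.
    # Ties are broken by first occurrence in the column (an arbitrary choice).
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_seqs)
    )


# Toy alignment (invented sequences, hypothetical subfamily members)
toy_members = ["ACGT-A", "ACGTTA", "ATGT-A"]
print(consensus(toy_members))
```

Sites at which no base is clearly in the majority would need manual inspection, which is consistent with the consensus having been determined "basically" by this rule.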


Table 1 Subfamilies, Names of Loci, Source Species, References and Accession Numbers of the Sequences Examined in the Present Study

 

    Results
Identification of Diagnostic Nucleotides and Classification of Sequences into Subfamilies
Previous studies of phylogenetic relationships among AFCs using the SINE method (Takahashi et al. 1998, 2001a, 2001b; Y. Terai et al., unpublished data) included analyses of sequences of SINEs in the AFC family at 70 independent loci. We generated an alignment of all these sequences (Supplementary Material). This alignment allowed us to identify six subfamilies on the basis of diagnostic nucleotides in the analyzed members of this family. Three of the six identified subfamilies were identical to those reported in an earlier study of the AFC family (subfamilies Af1, Af2, and Af3; see Terai, Takahashi, and Okada [1998]). Three other subfamilies were newly identified in this study, and we designated them Af4, Af5, and Af6.
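As a rough illustration of how candidate diagnostic sites can be located, the following Python sketch reports alignment columns at which subfamily consensus sequences differ. This is a simplification under stated assumptions: the actual classification relied on correlated changes across many member sequences, and the function name and toy sequences here are hypothetical.

```python
def diagnostic_sites(consensus_by_subfamily):
    """Return 1-based alignment positions where the consensus sequences differ."""
    seqs = list(consensus_by_subfamily.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("consensus sequences must share alignment coordinates")
    # A column is a candidate site when more than one symbol (or a gap) occurs.
    return [
        pos
        for pos, column in enumerate(zip(*seqs), start=1)
        if len(set(column)) > 1
    ]


# Toy consensus sequences (invented, for illustration only)
toy = {"sub1": "TACGT", "sub2": "CACGT", "sub3": "CAC-T"}
print(diagnostic_sites(toy))
```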

Mosaic Structures of Subfamilies
Considering the distribution of diagnostic nucleotides in the various subfamilies at a total of 93 sites (fig. 1, panel A), we found that the 5'-regions, which corresponded to positions 1–130 (referred to hereafter as the head region), of subfamilies Af1, Af2, and Af4 had almost identical patterns of diagnostic nucleotides (type A), whereas this region in each of the subfamilies Af3, Af5, and Af6 had a unique and different pattern of such nucleotides (types C, B, and D, respectively). To our surprise, the other regions (positions 131–350; referred to hereafter as the tail region) of the various sequences exhibited a different pattern of relationships among the subfamilies. The subfamilies Af1 and Af6 had very similar diagnostic nucleotides in the entire tail region (type X), whereas subfamilies Af2, Af3, and Af5 shared another pattern of diagnostic nucleotides in this region (type Y). The diagnostic nucleotides in the tail region of the Af4 subfamily were identical to those of type Y at positions 131–247, but those at subsequent positions (248–350) were more similar to those of type X. Therefore, we designated this type of tail region (YX). Our analysis suggested that the subfamilies of AFC SINEs had a mosaic structure, with various combinations of four kinds of head region (types A through D) and three kinds of tail region [types X, Y, and (YX)], each of which was associated with a unique distribution of diagnostic nucleotides (fig. 1, panel B).
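The mosaic structure can be summarized as a small lookup table. The Python sketch below uses the head and tail assignments listed in the Abstract (AX, AY, CY, A(YX), BY, and DX for Af1 through Af6) and the head/tail boundary at alignment position 130; the variable and function names are illustrative assumptions.

```python
from collections import defaultdict

# Head and tail types of each subfamily, as listed in the Abstract
MOSAIC = {
    "Af1": ("A", "X"),
    "Af2": ("A", "Y"),
    "Af3": ("C", "Y"),
    "Af4": ("A", "(YX)"),
    "Af5": ("B", "Y"),
    "Af6": ("D", "X"),
}

HEAD_END = 130  # last alignment position of the head region (positions 1-130)


def split_head_tail(aligned_seq):
    """Split an aligned sequence into head (1-130) and tail (131 onward)."""
    return aligned_seq[:HEAD_END], aligned_seq[HEAD_END:]


def group_by_region(index):
    """Group subfamilies by head type (index 0) or tail type (index 1)."""
    groups = defaultdict(list)
    for subfamily, types in MOSAIC.items():
        groups[types[index]].append(subfamily)
    return dict(groups)


print(group_by_region(0))  # subfamilies sharing each head type
print(group_by_region(1))  # subfamilies sharing each tail type
```

Grouping by head and by tail gives two different partitions of the same six subfamilies, which is the sense in which the structures are mosaic.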



Fig. 1.—Panel A, Diagnostic nucleotides in the consensus sequences of the subfamilies identified in the present study. The numbers on the top line show the nucleotide positions and correspond to those in the original alignment (Supplementary Material). Nucleotides in regions with different combinations of diagnostic nucleotides are highlighted in different colors. Hyphens denote gaps that were introduced to improve the alignment. Panel B, A schematic representation of the mosaic structures of the various subfamilies. The background colors correspond to those in panel A. The tRNA-related region, the core region, and the LINE-related region are indicated by black horizontal bars below the colored bars. The numbers on the top line show approximate nucleotide positions and correspond to those in the original alignment (Supplementary Material), as well as in panel A of this figure.

 
Comparison of Sequences in Head Regions
We examined the four types of the head region and found that the diagnostic nucleotides of type B were intermediate between those of types A and C. For example, the nucleotide at position 9 was T in type A but was C in types B and C. By contrast, the nucleotide at position 40 was T in types A and B, whereas type C lacked a corresponding nucleotide (had a gap) at this position. Among these three types, type B uniquely shared diagnostic nucleotides with type A at nine sites and with type C at 10 sites.

In order to characterize the sequences of the head region in greater detail, we subjected the tRNA-related region of each SINE (Takahashi et al. 1998), which corresponded to positions 5–85, to a homology search in the nucleotide sequence database. In figures 2 and 3, the tRNA-related regions of the Af3 and Af6 subfamilies are, respectively, compared with those of other subfamilies, as well as with genes for tRNAs that were identified in the nucleotide sequence database from their high homology scores. In this comparison, the sequence of the Af3 subfamily (type C) was most similar (81.9%) to the type-A sequence, which was shared by subfamilies Af1, Af2, and Af4. By contrast, the type-C sequence was less similar to the sequences of genes for tRNAs. The similarity scores were 74.0% with a gene for tRNA(Leu) from Spinacia oleracea (AJ400848; Schmitz-Linneweber et al. 2001), 72.6% with a gene for tRNA(Thr) from Neisseria meningitidis Z2491 (AL162752; Parkhill et al. 2000), and 68.5% with a gene for tRNA(His) from Mycobacterium leprae (U15186). The homology between the Af3 subfamily (type C) and the Af6 subfamily (type D) was also low (68.9%). When the tRNA-related region of the Af6 subfamily was compared with those of other subfamilies, it was clear that the homology was limited (65.3%–69.9%; fig. 3). However, much higher homologies were observed with genes for tRNAs: 79.5% with a gene for tRNA(Ala) from Leptospira interrogans serovar (AB024693), 76.7% with a gene for tRNA(Thr) from Caenorhabditis elegans (AF016671; The C. elegans Sequencing Consortium 1998), and 75.3% with a gene for tRNA(Lys) from Zymomonas mobilis (AF088897). These results suggested that types A and C, together with type B, which was intermediate between types A and C, were more closely related to each other than to genes for tRNAs. By contrast, type D was only distantly related to other types of head region in the AFC family but was more similar to genes for several tRNAs.
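The homology values above are percentages of identical positions between aligned sequences. The exact scoring rule is not stated in the text, so the following Python sketch rests on two assumptions: a gap opposite a base counts as a mismatch, and columns that are gaps in both sequences are ignored. The function name and the toy sequences are illustrative.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns that are gaps in both sequences (assumed convention).
    columns = [
        (a, b) for a, b in zip(seq_a, seq_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable columns")
    # A gap opposite a base never equals that base, so it counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned pair (invented): 8 of 10 columns are identical
print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```

Other conventions, such as excluding all gapped columns from the denominator, would give somewhat different percentages for the same alignment.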



Fig. 2.—Comparisons of the sequence and structure of the tRNA-homologous region (positions 5–85) of the consensus sequence of type B with those of types A and D, as well as with those of genes for tRNA(Leu) from S. oleracea (AJ400848; Schmitz-Linneweber et al. 2001), tRNA(Thr) from N. meningitidis Z2491 (AL162752; Parkhill et al. 2000), and tRNA(His) from M. leprae (U15186). In each sequence, nucleotides identical to the type-B consensus sequence are boxed. In the alignment of the sequences, we assumed indels in order to maximize homology. The calculated homologies are indicated at the lower right of each sequence. The numbers along the sequences of members of the AFC family are nucleotide positions that correspond to those in the original alignment (Supplementary Material). The first and second promoter regions for RNA polymerase III were inferred from the sequence reported by Galli, Hofstetter, and Birnstiel (1981) and are indicated in bold letters. Dots denote complementary base pairs in putative stem regions.

 


Fig. 3.—Comparisons of the sequence and structure of the tRNA-homologous region (positions 5–85) of the type-D consensus sequence with those of types A and B, as well as with those of the genes for tRNA(Ala) from L. i. serovar (AB024693), tRNA(Thr) from C. elegans (AF016671; The C. elegans Sequencing Consortium 1998), and tRNA(Lys) from Z. mobilis (AF088897). In each sequence, nucleotides identical to the type-D consensus sequence are boxed. For more details, see legend to figure 2.

 
Relative Retropositional Activities of Subfamilies at Various Stages in the Evolution of Cichlids
In the present data set, the dominant subfamily was Af1, which included 34 (48.6%) of the 70 SINE sequences analyzed. Af2 was the second most dominant subfamily, including 15 sequences (21.4%). The corresponding values for the other subfamilies were 14.3% for Af3, 10.0% for Af6, and 2.9% for each of Af4 and Af5.

To examine the relative retropositional activities of the subfamilies at various stages in the evolution of cichlids in Africa, we plotted the timing of insertion of all 70 AFC SINEs on a previously established phylogenetic tree (fig. 4, panel A; Takahashi et al. 1998, 2001a, 2001b; Y. Terai et al., unpublished data). Retroposition of members of the Af1 subfamily was detected at various sites on the tree. In particular, in the part of the tree that corresponded to an early period (stages I–IV), this subfamily accounted for 14 of the 18 (77.8%) members of the AFC family. Retroposition of members of this subfamily was also apparent at all later stages (V–X) on the tree, as well as at the branch leading to the tribe Lamprologini in Lake Tanganyika. Retroposition of the examined copies of the Af2 subfamily was evident at stages IV–VII, as well as at the branches leading to the tribes Lamprologini and Ectodini in Lake Tanganyika. This subfamily seems to have been most active at stages V and VI, accounting for nine of the 24 (37.5%) members of the AFC family that were inserted in their specific loci during this period. Retroposition of the examined members of the Af3 subfamily was restricted to the recent part of the phylogenetic tree (stages V–X, as well as the branches leading to the tribes Perissodini and Tropheini in Lake Tanganyika). Members of the Af6 subfamily were found to have been inserted at their specific loci at stages VI–IX, which also correspond to a recent but more limited part of the tree. Retroposition of sequences in the minor subfamilies, that is, the two copies each of the Af4 and Af5 subfamilies, was detected at stages II–III and VI, respectively.



Fig. 4.—Panel A shows the timing of the insertion of members of the AFC family, as assumed in earlier studies (Takahashi et al. 1998, 2001a, 2001b; Y. Terai et al., unpublished data). The times are indicated by arrowheads on the phylogenetic tree, which was based on the same cited studies. Numbers at nodes, internodes, and branches indicate names of loci that have been reported to be sites of insertion of a SINE sequence at the corresponding times in cichlid evolution. Subfamilies of sequences are identified by differently colored backgrounds. Polytomies in the tree indicate the absence of a sufficient number of informative loci to support the corresponding relationships or the existence of incongruities with respect to patterns of insertion of SINEs among loci, possibly as a result of incomplete lineage sorting or interspecific hybridization (Takahashi et al. 2001a, 2001b; Y. Terai et al., unpublished data). Such incongruities have been reported for the relationships indicated by thick vertical lines in the tree. Nodes and internodes were designated stage I through X, in order, from the old part to the recent part of the tree, for convenience. Species names of cichlids in Lake Tanganyika are followed by names of the tribes (Poll 1986) to which they belong. The cichlids from Lake Malawi in the phylogenetic tree include 16 endemic species analyzed by Takahashi et al. (2001a). The riverine cichlids that are not shown on the tree include species from nine genera analyzed by Y. Terai et al. (unpublished data). Panels B and C similarly show the timing of insertion of members of the AFC family of SINEs, focusing on the diagnostic nucleotides in the head region (panel B) and the tail region (panel C) separately. Types of head and tail regions are identified by differently colored backgrounds.

 
As mentioned earlier, each sequence in the AFC family has a mosaic structure with two components, namely, a head region and a tail region, in terms of the combination of diagnostic nucleotides. To determine whether there might be correlations between relative retropositional activities and specific diagnostic nucleotides in the head and in the tail component, we analyzed the timing of retroposition of the head region and the tail region separately in a manner similar to that described earlier.

The results of our analysis of head regions are shown in panel B of figure 4. In most of this tree, retroposition of sequences with a type-A head region, in terms of diagnostic nucleotides (subfamilies Af1, Af2, and Af4), seemed active. However, the frequency of retroposition of type-A sequences relative to that of other types seemed to have decreased somewhat at more recent stages (VII–X, for example) as a result of retroposition of subfamilies with head regions of types C and D. The observed retroposition of sequences with head regions of types C and D (corresponding to subfamilies Af3 and Af6, respectively) was restricted to recent stages on the tree (stages V–X and VI–IX, respectively), as well as to branches leading to the tribes Perissodini and Tropheini in the case of type C. Panel C in figure 4 shows a similar analysis based on the tail regions of the AFC SINEs. In this panel, retroposition of sequences with either an X or a Y type of tail was broadly distributed on various parts of the tree. This observation suggests that sequences with either type of tail region have been active throughout almost the entire investigated evolutionary time frame of the cichlids. Thus, our analyses of heads and tails yielded contrasting results for the relative retropositional frequencies of sequences with different types of head and tail region during cichlid evolution. The frequencies seemed variable when we focused on differences among types of head region but constant when we focused on differences among types of tail region.


    Discussion
The Origins of the Sequences of Head Regions
The observed mosaic structures of the subfamilies of AFC SINEs are probably the result of exchanges of head regions or tail regions of source genes among the preexisting or existing subfamilies. Comparison of the tRNA-related sequence in the head regions of types A, B, and C (fig. 2) showed that these regions are similar to one another, which suggests that they might have been derived from a common ancestral sequence in the AFC family. By contrast, the stronger similarity of the corresponding region of the type-D sequence to genes for tRNAs themselves than to head regions of other types (fig. 3) suggests that the type-D sequence might be related to another unknown family of SINEs or to the gene or pseudogene for a tRNA. Because the tail region of the Af6 subfamily exhibited extremely strong homology to the corresponding region of the Af1 subfamily (type X; fig. 1, panel A), it is likely that the proposed exchange of sequence was relatively recent. By contrast, the maximum homology of the tRNA-related region of the Af6 subfamily (type D) to tRNA genes was only 79.5% (fig. 3). This observation supports the hypothesis that the sequence involved in the aforementioned recent event was a SINE or the pseudogene for a tRNA that had already diverged from its original sequence.

Possible Mechanism for and Role of Sequence Exchanges among Subfamilies
The mosaic structures of SINE sequences have been discussed in reports on S1 elements in cruciferous plants (Lenoir et al. 1997) and B1 elements in rodents (Zietkiewicz and Labuda 1996). The authors of the cited reports suggested the possibility that such structures might have been formed by the exchange of sequences among different subfamilies by gene conversion. Kass, Batzer, and Deininger (1995) proposed models for such gene conversion between SINEs. Their models include the following possible scenarios: (1) the cDNA of a SINE is used as a template to repair a double-strand break within the genomic sequence of another SINE and (2) unequal crossing-over, by homologous recombination, occurs between neighboring copies of SINEs. It is possible that gene conversion might also have been responsible for the mosaic structures of the subfamilies examined in the present study. In this case, the border between the head region and the tail region might be a hot spot for such double-strand breakage or recombination. However, gene conversion might not be the only explanation. We postulated that the mechanism of formation of the mosaic structures of AFC SINEs might be related somehow to the sharing by SINEs and LINEs of the retropositional machinery. As mentioned in the Introduction, the reverse transcriptase encoded by a LINE is considered to be responsible for the retroposition of a partner SINE. Moreover, recruitment of the 3'-ends of LINEs by SINEs has been suggested to be a general and important evolutionary phenomenon that allows SINEs to retain their retropositional activity (Ohshima et al. 1996; Okada et al. 1997; Terai, Takahashi, and Okada 1998). Gilbert and Labuda (1999, 2000) recently found that many SINE families in the genomes of diverse animals share a conserved sequence, designated the core, adjacent to the 3'-end of their respective tRNA-related regions. The vast and ancient group of SINEs, which Gilbert and Labuda designated CORE-SINEs, includes various SINE families whose 3'-ends are homologous to those of LINE families. They proposed that the core region has provided CORE-SINEs with the ability to recruit the 3'-ends of active LINEs in various lineages of animals during the course of evolution, giving the SINEs the ability to exploit the active retropositional machinery of the LINEs. As a mechanism for recruitment of the 3'-ends of these LINEs by the SINEs, Gilbert and Labuda (2000) proposed that template switching (Bowman, Hu, and Pathak 1998) might occur between RNAs transcribed from a LINE and a CORE-SINE at the conserved core domain during the process of reverse transcription.

The AFC family from cichlids is also a member of the vast superfamily of CORE-SINEs (Gilbert and Labuda 1999). AFC SINEs have a core domain adjacent to the 3'-end of the tRNA-related region and share the sequence of the 3'-end with the CiLINE2 family of LINEs (Terai, Takahashi, and Okada 1998). We found that the border between the head region and the tail region of the AFC sequence is located within the core domain (fig. 1, panel B). Thus, if template switching were to have occurred not only between RNAs transcribed from the AFC SINE and CiLINE2 but also between RNAs transcribed from the different subfamilies of AFC SINEs at the core domain, it is easy to see how the mosaic structures of SINE sequences, as observed in the present study, might have been generated. This hypothetical scheme can be summarized as follows. First, the reverse transcriptase encoded by an active copy of CiLINE2 recognizes the 3'-end of the RNA transcribed from an AFC SINE in a certain subfamily. Second, the reverse transcriptase begins the synthesis of cDNA using the RNA as template. When the reverse transcription has proceeded to the region of the template RNA that corresponds to the core domain, the reverse transcriptase switches its template to the corresponding region of another RNA that has been transcribed from a SINE in a different subfamily of the AFC family. Finally, completion of reverse transcription and integration of the cDNA at a certain locus in the genome create a copy of the AFC family that is a mosaic, representing parts of each of the two subfamilies. If this new mosaic SINE continues to be actively transcribed, it can become the source gene for a new subfamily with a mosaic structure.
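The hypothetical template-switching scheme can be sketched in Python as a simple string operation. The sketch is a deliberate simplification: sequences are written in sense-strand orientation with shared alignment coordinates, complementarity and integration are ignored, and the function name, the switch position, and the labeled toy strings are all illustrative assumptions.

```python
def template_switch(first_rna, second_rna, switch_pos):
    """Sense-strand sequence of the mosaic copy produced by one switch.

    Reverse transcription starts at the 3' end of first_rna and copies it
    down to switch_pos, then continues on second_rna toward its 5' end.
    The product therefore carries the tail of first_rna and the head of
    second_rna.
    """
    if len(first_rna) != len(second_rna):
        raise ValueError("templates must share alignment coordinates")
    if not 0 <= switch_pos <= len(first_rna):
        raise ValueError("switch position lies outside the template")
    tail = first_rna[switch_pos:]    # copied first, from the 3' end
    head = second_rna[:switch_pos]   # copied after the switch
    return head + tail


# Toy templates: upper-case letters label head or tail type, "c" labels the core
tail_donor = "AAAAccccXXXX"  # for example, an Af1-like RNA (head A, tail X)
head_donor = "DDDDccccZZZZ"  # a hypothetical RNA with a type-D head
print(template_switch(tail_donor, head_donor, switch_pos=6))
```

Because the switch position lies inside the shared core, the junction leaves no trace in the core itself; only the diagnostic nucleotides on either side reveal the mosaic origin.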

The present analysis suggests that different subfamilies were associated with different relative frequencies of amplification at each stage in the evolution of cichlids (fig. 4, panel A). It has been proposed that the retropositional activity of a SINE is affected by various factors, such as chromatin structure, methylation, cis-acting promoter elements, trans-acting factors, and RNA processing (for review, see Schmid and Maraia 1992). However, the primary prerequisite for retroposition is the integrity of the SINE sequence itself, even when the environment is ideal with respect to all these factors. Recent evidence for the possible involvement of the reverse transcriptase from a LINE in the retroposition of a SINE implies that the sequence at the 3'-end of the SINE itself, which is recognized by the reverse transcriptase, is especially important for retropositional activity. The observed generality of retroposition of sequences with either an X or a Y type of tail region throughout the cichlid phylogenetic tree (fig. 4, panel C) seems to support this hypothesis, if we consider that the reverse transcriptase has a relatively strict preference for the sequence in the tail region and, hence, its sequence cannot easily be replaced by another related sequence that has accumulated mutations. By contrast, the observed variability in the relative retropositional activities of sequences with different types of head region at the various stages of cichlid evolution (fig. 4, panel B) is at least consistent with the notion that this region has more flexibility in terms of sequence because of weaker constraints on retroposition.

What factors might have influenced the relative retropositional activities of the different subfamilies of AFC SINEs during the course of evolution? Because retroposition of the Af3 and Af6 subfamilies was restricted to the recent part of the tree (fig. 4, panel A), it seems that their source genes might have become active at around stages V and VI, respectively. The activation of their retroposition might have been related, directly or indirectly, to the emergence of these subfamilies themselves by putative recombination events, such as gene conversion or template switching. In this process, the active tails (types Y and X for subfamilies Af3 and Af6, respectively) were introduced into preexisting ancestral sequences, replacing tails that had been inactivated by the accumulation of mutations. If this hypothesis is correct, the recombination events that were responsible for the observed mosaic structures might have played a role in recycling of dead copies of AFC SINEs, contributing to the diversification of the AFC SINEs in the genomes of cichlids. Finally, this putative mechanism for the recent activation of the Af3 and Af6 subfamilies does not exclude the possibility that certain changes in the local environment of the genome (Schmid and Maraia 1992; Takasaki et al. 1996) might have triggered these events. For a further examination of the background associated with the changes in the retropositional activities of the various subfamilies, we need more data on the retropositional dynamics of the AFC SINEs, on mechanisms of regulation of retroposition by the local environment of the genome, and on biochemical interactions between SINEs and the reverse transcriptases encoded by LINEs.


    Acknowledgements
This work was supported by a Grant-in-Aid for Specially Promoted Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan to N.O.


    Footnotes
 
Pierre Capy, Reviewing Editor

1 Present address: Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut.

Abbreviations: SINE, short interspersed element; LINE, long interspersed element.

Keywords: retroposon; AFC SINEs; cichlid; template switching; diagnostic nucleotides

Address for correspondence and reprints: Norihiro Okada, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan. nokada{at}bio.titech.ac.jp


    References
 

    Altschul S. F., W. Gish, W. Miller, E. W. Myers, D. J. Lipman, 1990 Basic local alignment search tool J. Mol. Biol 215:403-410

    Bowman R. R., W. S. Hu, V. K. Pathak, 1998 Relative rates of retroviral reverse transcriptase template switching during RNA- and DNA-dependent DNA synthesis J. Virol 72:5198-5206

    Coulter G. W., 1991 The benthic fish community Pp. 151–199 in G. W. Coulter, ed. Lake Tanganyika and its life. Oxford University Press, London

    Deininger P. L., M. A. Batzer, 1995 SINE master genes and population biology Pp. 43–60 in R. J. Maraia, ed. The impact of short interspersed elements (SINEs) on the host genome. R. G. Landes Company, Austin, Tex

    Fryer G., T. D. Iles, 1972 The cichlid fishes of the Great Lakes of Africa: their biology and evolution Oliver & Boyd, Edinburgh

    Galli G., H. Hofstetter, M. L. Birnstiel, 1981 Two conserved sequence blocks within eukaryotic tRNA genes are major promoter elements Nature 294:626-631

    Gilbert N., D. Labuda, 1999 CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs Proc. Natl. Acad. Sci. USA 96:2869-2874

    ———. 2000 Evolutionary inventions and continuity of CORE-SINEs in mammals J. Mol. Biol 298:365-377

    Greenwood P. H., 1984 African cichlids and evolutionary theories Pp. 141–154 in A. A. Echelle and I. Kornfield, eds. Evolution of fish species flocks. University of Maine at Orono Press, Orono, Me

    Higgins D. G., A. J. Bleasby, R. Fuchs, 1992 CLUSTAL V: improved software for multiple sequence alignment Comput. Appl. Biosci 8:189-191

    Jurka J., A. Milosavljevic, 1991 Reconstruction and analysis of human Alu genes J. Mol. Evol 32:105-121

    Kass D. H., M. A. Batzer, P. L. Deininger, 1995 Gene conversion as a secondary mechanism of short interspersed element (SINE) evolution Mol. Cell. Biol 15:19-25

    Kido Y., M. Himberg, N. Takasaki, N. Okada, 1994 Amplification of distinct subfamilies of short interspersed elements during evolution of the Salmonidae J. Mol. Biol 241:633-644

    Kido Y., M. Saitoh, S. Murata, N. Okada, 1995 Evolution of the active sequences of the HpaI short interspersed elements J. Mol. Evol 41:986-995

    Lenoir A., B. Cournoyer, S. Warwick, G. Picard, J.-M. Deragon, 1997 Evolution of SINE S1 retroposons in Cruciferae plant species Mol. Biol. Evol 14:934-941

    Ogiwara I., M. Miya, K. Ohshima, N. Okada, 1999 Retropositional parasitism of SINEs on LINEs: identification of SINEs and LINEs in elasmobranchs Mol. Biol. Evol 16:1238-1250

    Ohshima K., M. Hamada, Y. Terai, N. Okada, 1996 The 3' ends of tRNA-derived short interspersed repetitive elements are derived from the 3' ends of long interspersed repetitive elements Mol. Cell. Biol 16:3756-3764

    Okada N., 1991a. SINEs Curr. Opin. Genet. Dev 1:498-504

    ———. 1991b. SINEs: short interspersed repeated elements of the eukaryotic genome TREE 6:358-361

    Okada N., H. Endoh, K. Sakamoto, K. Matsumoto, 1985 Many highly repetitive and transcribable sequences are derived from tRNA genes Proc. Jpn. Acad. Ser. B 61:363-367

    Okada N., M. Hamada, 1997 The 3' ends of tRNA-derived SINEs originated from the 3' ends of LINEs: a new example from the bovine genome J. Mol. Evol 44:S52-S56

    Okada N., M. Hamada, I. Ogiwara, K. Ohshima, 1997 SINEs and LINEs share common 3' sequences: a review Gene 205:229-243

    Okada N., K. Ohshima, 1995 Evolution of tRNA-derived SINEs Pp. 61–79 in R. J. Maraia, ed. The impact of short interspersed elements (SINEs) on the host genome. R. G. Landes Company, Austin, Tex

    Parkhill J., M. Achtman, K. D. James, et al. (28 co-authors) 2000 Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491 Nature 404:502-506

    Poll M., 1986 Classification des Cichlidae du Lac Tanganyika: tribus, genres et espèces Mem. Cl. Sci. Acad. R. Belg 45:5-163

    Sakamoto K., N. Okada, 1985 Rodent type 2 Alu family, rat identifier sequence, rabbit C family, and bovine or goat 73-bp repeat may have evolved from tRNA genes J. Mol. Evol 22:134-140

    Schmid C., R. Maraia, 1992 Transcriptional regulation and transpositional selection of active SINE sequences Curr. Biol 2:874-882

    Schmitz-Linneweber C., R. M. Maier, J. P. Alcaraz, A. Cottet, R. G. Herrmann, R. Mache, 2001 The plastid chromosome of spinach (Spinacia oleracea): complete nucleotide sequence and gene organization Plant Mol. Biol 45:307-315

    Shedlock A. M., M. C. Milinkovitch, N. Okada, 2000 SINE evolution, missing data, and the origin of whales Syst. Biol 49:808-817

    Shedlock A. M., N. Okada, 2000 SINE insertions: powerful tools for molecular systematics BioEssays 22:148-160

    Shen M. R., M. A. Batzer, P. L. Deininger, 1991 Evolution of the master Alu gene(s) J. Mol. Evol 33:311-320

    Takahashi K., M. Nishida, M. Yuma, N. Okada, 2001a. Retroposition of the AFC family of SINEs (short interspersed repetitive elements) before and during the adaptive radiation of cichlid fishes in Lake Malawi and related inferences about phylogeny J. Mol. Evol 53:496-507

    Takahashi K., Y. Terai, M. Nishida, N. Okada, 1998 A novel family of short interspersed repetitive elements (SINEs) from cichlids: the patterns of insertion of SINEs at orthologous loci support the proposed monophyly of four major groups of cichlid fishes in Lake Tanganyika Mol. Biol. Evol 15:391-407

    ———. 2001b. Phylogenetic relationships and ancient incomplete lineage sorting among cichlid fishes in Lake Tanganyika as revealed by analysis of the insertion of retroposons Mol. Biol. Evol 18:2057-2066

    Takasaki N., S. Murata, M. Saitoh, T. Kobayashi, L. Park, N. Okada, 1994 Species-specific amplification of tRNA-derived short interspersed repetitive elements (SINEs) by retroposition: a process of parasitization of entire genomes during the evolution of salmonids Proc. Natl. Acad. Sci. USA 91:10153-10157

    Takasaki N., L. Park, M. Kaeriyama, A. J. Gharrett, N. Okada, 1996 Characterization of species-specifically amplified SINEs in three salmonid species (chum salmon, pink salmon, and kokanee): the local environment of the genome may be important for the generation of a dominant source gene at a newly retroposed locus J. Mol. Evol 42:103-116

    Terai Y., K. Takahashi, N. Okada, 1998 SINE cousins: the 3'-end tails of the two oldest and distantly related families of SINEs are descended from the 3' ends of LINEs with the same genealogical origin Mol. Biol. Evol 15:1460-1471

    The C. elegans Sequencing Consortium. 1998 Genome sequence of the nematode C. elegans: a platform for investigating biology Science 282:2012-2018

    Ullu E., C. Tschudi, 1984 Alu sequences are processed 7SL RNA genes Nature 312:171-172

    Weiner A. M., 1980 An abundant cytoplasmic 7SL RNA is complementary to the dominant interspersed middle repetitive DNA sequence family in the human genome Cell 22:209-218

    Weiner A. M., P. L. Deininger, A. Efstratiadis, 1986 Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information Ann. Rev. Biochem 55:631-661

    Zietkiewicz E., D. Labuda, 1996 Mosaic evolution of rodent B1 elements J. Mol. Evol 42:66-72

Accepted for publication April 4, 2002.