Group I and group II introns, which splice via RNA-catalyzed
pathways, can invade DNA sequences by virtue of proteins expressed from
open reading frames (ORFs) contained within them. These
intron products are endonucleases in the case of group I introns and
reverse transcriptases (RTs) with associated endonuclease activity in
the case of the group II introns. Two kinds of mobility reactions will
be considered for each intron type: homing into cognate intronless
alleles and transposition to non-allelic sites.
Intron homing is a process whereby the intron moves from an intron-containing allele to an intronless allele in a homology-dependent gene conversion event. Co-conversion of flanking exon sequences often accompanies intron homing (reviewed in (1) and (2)).
Figure 1:
DNA-mediated group I intron homing. A, overview of mobility pathway. Cleavage of the recipient by
the endonuclease (ENDO) results in intron inheritance via gene
conversion. B, the DSBR pathway. Subsequent to endonuclease
cleavage the recipient allele undergoes exonucleolytic degradation and
homologous sequence alignment with an intron-containing donor (a). A 3`-end of the recipient invades the donor, which serves
as a template for repair synthesis (b). During DSBR (c-g), DNA synthesis through the intron results in
expansion of a D-loop (c), which then serves as a substrate
for repair synthesis of the non-invading strand (d). Holliday
junctions formed during this process (e) are resolved to
produce either non-crossover (f) or crossover (g)
products. Activities involved in the T4 intron homing pathway are
indicated, with Polymerase plus signifying the requirements
for polymerase accessory functions (3) ; presumably
similar activities participate in homing events in other systems. C, the SDSA (BM) model (6, 7) .
As DNA synthesis proceeds, the replication bubble migrates with
the replicative end (c`-e`). The newly synthesized
strand is released from the donor (f`) and serves as template
for repair of the noninvading strand (f`-g`) to generate
non-crossover products only (g`).
Half-arrows indicate 3`-ends of DNA strands.
Experiments in the T4 phage system have implicated exonucleolytic,
synaptic, and DNA-synthetic functions of the phage in the homing
process (3) (Fig. 1B). A role for
DNA ligase and resolvase was also established. However, a reduced level
of homing can occur in the absence of known resolvases, implying the
existence of alternative resolution enzymes or additional pathways for
intron homing. The latter possibility is favored by the
underrepresentation of crossover events among the homing
products.
One such alternative pathway is the
synthesis-dependent strand annealing (SDSA) model (reviewed in (5) ), which has been invoked as an alternative to the DSBR
pathway to explain gene conversion from ectopic sites in
P-element-induced gap repair in Drosophila (Fig. 1C)(6) . This pathway is similar to
the bubble migration (BM) pathway for T4 phage replication(7) .
The initial steps of the SDSA (and BM) pathway involving cleavage and
strand invasion are the same as those of DSBR. Unlike DSBR, however,
Holliday junctions are not formed, obviating the need for resolvase
function and resulting only in non-crossover
products(5) .
Alternative homing pathways in both
pro- and eukaryotic systems require further investigation.
The group II homing pathway (Fig. 2A), which has been elucidated recently for the aI2 intron, is remarkable in that it depends on three activities of the aI2-encoded protein: endonuclease, maturase, and RT(9, 10) . The endonuclease activity, which also requires the intron RNA, makes a staggered cut in the recipient DNA, cleaving the antisense strand at a specific site in the downstream exon and the sense strand at the junction between the two exons. The 3`-OH of the cleaved antisense strand is used as a primer for first-strand cDNA synthesis by RT, using the pre-mRNA precursor as template(10) . This sequence of events is rather analogous to the priming of cDNA synthesis by the site-specific non-long terminal repeat retroelement R2Bm(11) , consistent with the group II introns representing a type of site-specific retrotransposon.
Figure 2: RNA-mediated mobility events. Wavy lines, RNA; straight lines, DNA; thin lines, exons; thick lines, intron. A, group II intron homing. The pathway, worked out for the yeast mitochondrial aI2 intron (8, 9, 10) involves the maturase (M), endonuclease (E), and RT activities of the aI2-encoded protein. The activity required for each step is indicated as white lettering on a black background. Endonuclease activity requires intron RNA, which is excised as a lariat but is depicted as a linear molecule. Cleavage sites on the DNA recipient are shown as dots, with the exon junction represented as a line. RT-directed cDNA synthesis is primed from the downstream 3`-OH of the cleaved recipient DNA, with the pre-mRNA as template. Details on completion of the integration step remain to be determined. B, group I and group II intron transposition. Reverse splicing into a foreign RNA, shown to occur for both intron types, yields an RNA that, when reverse-transcribed and recombined with the genome, results in transposition of the intron to an ectopic site. The latter step has some experimental support for group II introns only (22, 23, 24) . C, group I and group II intron loss. A cDNA copy of the spliced mRNA is proposed to recombine with the genome to render it intron-minus.
This model is in accord with several previously unexplained observations. First, inhibition of splicing blocks intron homing (9, 12). The finding that specific defects of the cis-acting intron RNA substructure abolish homing (9) can be reconciled with the role of intron RNA in endonuclease function as either cofactor or catalyst (10, 13). Second, the model can explain the relatively inefficient co-conversion downstream of the cleavage site as resulting from the limited exonucleolytic degradation that can occur before priming of cDNA synthesis ensues from the downstream 3`-OH of the cleaved recipient. One issue that remains unclear is the manner in which the complement to the first-strand cDNA is made. Another is the possibility that group II homing may occur by more than one mechanism. For example, an RT-independent group I-type pathway was suggested by the finding that a mutant aI2 protein that lacks RT activity but retains endonuclease function supports 40% homing activity (9, 10). However, further study is needed to determine whether that pathway occurs in wild-type crosses.
The sporadic distribution of the conserved group I and group II introns suggests that each arose from ancestral introns that transposed to heterologous sites. One possibility is that the introns transpose by a degenerate homing event with relaxed homology requirements, resulting in illegitimate recombination. Although no true transposition events by this pathway have yet been documented, evidence for transposition via reverse splicing into foreign RNAs is accumulating for group I and particularly for group II introns. The pathway involves reverse splicing of an excised intron into non-allelic RNA, followed by transfer into the genome at the heterologous site, most likely via a cDNA copy of the recombinant RNA (Fig. 2B).
The first demonstration of a partial reverse splicing reaction in vivo was with the Cr.LSU intron, a chloroplast group I intron from Chlamydomonas reinhardtii(17) . This intron has been shown to undergo the first step of reverse splicing into the cytoplasmic 5.8 S rRNA of its host in vivo and in vitro. Minor changes in the 5.8 S sequence would allow complete integration of the intron. Nevertheless, how cDNA synthesis would ensue for introns that do not encode their own RT activity remains unclear, although a role for trans-acting cellular RTs can be readily envisaged.
The preferential occurrence of group I introns in rRNA and tRNA genes may reflect the abundance of these RNAs, which could provide copious targets for reverse splicing in vivo(17) . Although integration into these targets may be a common phenomenon(14) , maintenance of stable RNA function and trapping of the intron insertion event through capture by the genome (e.g. via a cDNA intermediate) are likely to be rare. Such infrequent events would be most likely to occur with abundant RNAs.
Several recent studies
aimed at understanding site-specific deletions of fungal mtDNAs led to
the discovery that group II introns that encode RT-like proteins can
transpose to ectopic sites in mtDNAs. For example, intron 1 of the COX1 gene of yeast mtDNA, a group II intron that can also
carry out site-specific homing(8) , has been inferred to
reverse splice into several sites in a group I intron of the COX1 gene, aI5, to form twintrons with an internal group II intron
and an external group I intron(22) . A similar intron in Podospora mtDNA was found to transpose to a site in mtDNA near
a tRNA gene(23) , whereas the group II intron of Schizosaccharomyces pombe mtDNA was found to transpose to
multiple sites(24) . In each case, an RNA-mediated event
involving RT was inferred (Fig. 2B). After ectopic
insertion, the genome would contain two copies of the intron, so that
homologous recombination would result in the deletion of one intron
copy plus sequences between them. This type of deletion event could
explain the circular
Sen DNA, which contains a group II intron and
is involved in the senescence phenomenon in Podospora(23) (reviewed in (2) ).
Although no cDNA intermediate for these transposition events has yet been demonstrated, other studies in yeast mitochondria support the inference that such cDNAs can be made. For example, the RT activity overproduced in a mutant of the aI2 intron deleted for a catalytic domain (domain 5) is much less specific for aI2 and, instead, uses other mitochondrial RNAs as a template(9) . Furthermore, reverse transcription and cDNA synthesis are strongly implicated in the intron loss phenomenon (Fig. 2C), which has been reported for both group I and group II introns of the COB and COX1 genes of yeast mtDNA. These events are detected among revertants of some intron mutants and always involve loss of the mutant intron. Frequently adjacent (and unmutated) introns are lost simultaneously (for example, see (25) ). Since the exons separating the lost introns are retained, it was proposed that spliced mRNAs are reverse transcribed and recombined into mtDNA, resulting in loss of the introns. The RT-encoding group II introns are the likely source of the RT activity since strains lacking both aI1 and aI2 do not undergo intron loss(25) .
Proteins That Promote Intron Mobility
The proteins encoded by the group I introns comprise four
families of endonucleases, whereas the group II intron proteins form
one fairly homogeneous class of RT-like proteins. Interestingly, a
relationship has been established between one of the group I intron
endonuclease families and a Zn finger-like motif in
some group II intron-encoded proteins(26, 27) . The
recently discovered group II intron endonuclease activity is likely to
be associated with this motif(10) . Additionally, several of
the proteins encoded by both group I and group II introns have maturase
function to promote splicing of their cognate intron (reviewed in Refs.
2 and 28). The reading frames of the group I and group II introns may
occur in freestanding form within the intron or in-frame with the
upstream exon (reviewed in (28) ). In the latter case the
precursor protein appears to be processed proteolytically to release
the active intron-encoded
protein(2, 28, 29, 30) .
Figure 3:
Conserved endonuclease motifs in
intron-encoded and related proteins. Numbers in brackets indicate position in protein, counting from the beginning of the
intron reading frame; the last bracket contains the number of
amino acids to the end of the protein, when known. Dots correspond to gaps in the alignment. In each case, the amino acids
for which the motif is named are indicated in black boxes above the compilations. Highly conserved amino acids are indicated
below the compilations (Consensus). Uppercase,
invariant; lowercase, conserved in >66.7% of cases. Letters in parentheses indicate acidic residues (D or
E) or hydrophobic residues (I, L, or V) in upper- or lowercase
depending on whether they collectively represent 100% or >66.7% of
residues at a particular position, respectively. A, the
LAGLIDADG motif. The consensus was derived from the depicted sequences,
which represent proteins with demonstrated endonuclease activity. I, intron-encoded; PI, intein; HO Endo and Endo.SceI, non-intron endonucleases of Saccharomyces
cerevisiae. The compilation is modified from (28) , which
identifies all endonucleases except I-PorI (51) and
PI-PspI(52) . B, the GIY-YIG motif.
Consensus-17 was derived by the Pileup program of the GCG sequence
analysis package and visual examination of 17 aligned sequences
((53-55); Mary E. Bryk, personal communication). Consensus-3 was
derived from the three GIY-YIG proteins with demonstrated endonuclease
activity. Dashes correspond to a non-conserved
19-31-amino acid block with each sequence containing an arginine
residue. C, the H-N-H motif. The consensus motif was derived
from 40 proteins as described(26, 27) . Only the group
II bacterial ORFs and those proteins with demonstrated endonuclease
activity are listed: colicins and endonuclease McrA from Escherichia coli (Eco), bacteriophage group I intron
endonucleases I-HmuI and I-TevIII(28) ,
bacterial group II intron ORFs from Calothrix PCC7601 (Cpc) and Azotobacter vinelandii (Avi)(4) , chloroplast group II intron ORF from
the green alga Scenedesmus obliquus (Sob), and
the S. cerevisiae mitochondrial group II protein from the aI2
intron (Sce). The Cs of a CXXC component of a
putative Zn finger located upstream of the H-N-H
motif and (H/C)XXC immediately preceding
HX H of the H-N-H motif are boxed. D, the His-Cys box. Alignment is from (38).
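The consensus notation described in the legend to Fig. 3 can be illustrated with a minimal sketch. The legend specifies only the notation, so the details below are assumptions: fractions are taken over all aligned sequences (a gap counts as a non-match), a single residue takes priority over a residue class, the letter shown in parentheses is the most frequent member of the class, and the function names and toy alignment are hypothetical rather than taken from the Pileup program or the cited compilations.

```python
from collections import Counter

# Residue classes named in the legend: acidic (D, E) and hydrophobic (I, L, V).
CLASSES = ("DE", "ILV")
GAP = "."  # dots correspond to gaps in the alignment


def column_symbol(column, threshold=0.667):
    """Return the consensus symbol for one alignment column.

    Uppercase = invariant; lowercase = present in more than `threshold`
    of the sequences; parentheses = a residue class rather than a residue.
    Assumption: fractions are computed over all sequences, gaps included.
    """
    n = len(column)
    counts = Counter(r for r in column if r != GAP)
    if not counts:
        return GAP
    # Most frequent residue; ties resolve alphabetically (an arbitrary choice).
    residue, top = max(sorted(counts.items()), key=lambda kv: kv[1])
    if top == n:
        return residue.upper()
    if top / n > threshold:
        return residue.lower()
    # No single residue qualifies, so test the two residue classes.
    for members in CLASSES:
        total = sum(counts[r] for r in members)
        if total == 0:
            continue
        representative = max(sorted(members), key=lambda r: counts[r])
        if total == n:
            return f"({representative.upper()})"
        if total / n > threshold:
            return f"({representative.lower()})"
    return GAP


def consensus(aligned):
    """Build a consensus line from equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    return "".join(column_symbol(col) for col in zip(*aligned))


if __name__ == "__main__":
    # Hypothetical four-sequence toy alignment, not data from Fig. 3.
    print(consensus(["GIYD", "GIYE", "GLYD", "GVAE"]))
```

With the strict ">66.7%" reading used here, two matches out of three sequences do not qualify as conserved; whether the original compilations treated that case as conserved is not stated in the legend.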
The Zn domains of group II intron ORFs
were previously proposed to resemble the Zn finger of
retroviral integrases (45) ; that similarity, however, is
probably superseded by the definition of the H-N-H family of proteins.
Furthermore, the Zn
finger is the most amino-terminal
of three integrase domains, whereas in group II introns the Zn domain
is the carboxyl-terminal domain, and there are no cognates in group II
ORFs for the other integrase domains. The relationship between the Zn
domain with its two conserved CXXC or HXXC motifs and
the H-N-H motif in the group II introns is shown in Fig. 3C.
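As a rough illustration of how the CXXC and HXXC elements mentioned above might be located in a protein sequence, the following sketch scans for either pattern with a regular expression. The function name and the test sequence are hypothetical and do not correspond to any protein in Fig. 3; a match indicates only the spacing of the residues, not a functional Zn-binding site.

```python
import re

# C or H, any two residues, then C. The lookahead allows overlapping matches.
CXXC_OR_HXXC = re.compile(r"(?=([CH]..C))")


def find_cxxc_like(sequence):
    """Return (1-based position, matched tetrapeptide) for each CXXC/HXXC hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXC_OR_HXXC.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the pattern.
    for position, motif in find_cxxc_like("MKCAACGGHTTCPL"):
        print(position, motif)
```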
What might be the unifying feature behind two different types of introns being capable of movement by at least two disparate pathways driven by distinct intron-encoded enzymatic activities? The underpinnings of this situation likely reside in two basic premises. First, the intron-encoded functions originated from non-intronic sources. Presumably they were capable of mobilizing their own coding sequences, and that ability eventually led to their colonizing introns. Second, self-splicing introns might provide convenient havens for invasive genetic elements, as insertion into non-essential regions of introns would not be expected to have catastrophic consequences. Thereby, both intron types, rather than fall victim to invasion, acquired the potential for mobility. The intron-ORF invasion hypothesis (46, 47) has gained credence with the discovery of LAGLIDADG endonucleases in the structurally and mechanistically distinct archaeal introns(48) . Further support for the hypothesis derives from lengthy endonuclease recognition sequences flanking corresponding intron ORFs (49, 50) . Nevertheless, the reasons specific endonuclease families are confined to group I introns whereas RT activity is associated strictly with group II introns remain obscure.