(Received for publication, June 22, 1995; and in revised form, September 7, 1995)
Long interspersed nuclear elements (LINEs) are mobile DNA
elements which propagate by reverse transcription of RNA intermediates.
LINEs lack long terminal repeats, and their expression is controlled by
promoters located inside the transcribed region of unit-length DNA
copies. Doc elements constitute one of the seven families of LINEs
found in Drosophila melanogaster. Plasmids in which the
chloramphenicol acetyltransferase (CAT) gene is preceded by DNA
segments from different Doc family members were used as templates for
transient expression assays in Drosophila S2 cells.
Transcription is initiated at the 5` end of Doc elements within
hexamers fitting the consensus (C/G)AYTCG and is regulated by a DNA
region which is located 20 base pairs (bp) downstream from the RNA
start site(s). The region includes a sequence (RGACGTGY motif, or DE2)
which stimulates transcription in other Drosophila LINEs, and
two adjacent elements, DE1 and DE3. Moving the downstream region either
4 bp away from, or 5 bp closer to the RNA start site region inhibited
transcription. Sequences located approximately 200 bp downstream from the Doc 5`
end repressed CAT expression in an orientation- and position-dependent
manner. The inhibition reflects impaired translation of the CAT gene
possibly consequent to the interaction of specific Doc RNA sequences
with a cellular component.
Doc is one of 50 or more mobile DNA elements that have been
identified in the fruit fly Drosophila melanogaster (Finnegan
and Fawcett, 1986). Doc elements lack terminal repeats, instead
terminating at the 3` end in runs of adenine residues flanked by
polyadenylation signals. They differ in size, being variously truncated
at the 5` end, and are flanked by target site duplications which vary
in length from 10 to 14 bp (Schneuwly et al.,
1987; Driver et al., 1989). Complete family members are
4.7 kb in length and potentially encode a nucleic acid-binding
protein and a reverse transcriptase (O'Hare et
al., 1991). The structure and coding capacity of Doc is typical of
LINEs, nomadic DNA sequences conserved in evolution from protozoa to
man (Doolittle et al., 1989; Xiong and Eickbush, 1990). LINEs,
also known as type II retrotransposons, use self-encoded proteins to
reverse transcribe their own mRNA and integrate cDNA copies at new
locations in the genome. This hypothesis has been experimentally
supported by the analysis of transgenic flies carrying intron-marked Drosophila I factors (Pelisson et al., 1991; Jensen
and Heidmann, 1991) and baby hamster kidney cells transfected with
mouse LINE-1 elements (Evans and Palmiter, 1991).
Mammalian genomes
harbor >10 LINEs that belong to a single superfamily
(Singer and Skowronski, 1985; Hutchison et al., 1989). By
contrast, distinct LINE families, each including 50-80 members,
coexist in D. melanogaster. In addition to Doc, six other
families of LINE elements have been described so far in this organism,
including the I factor (Fawcett et al., 1986), F (Di Nocera
and Casari, 1987), G (Di Nocera, 1988), and jockey (Priimagi et
al., 1988) elements, and type I and type II ribosomal DNA
insertions (Jakubczak et al., 1990).
LINEs differ markedly from other mobile DNA sequences that also propagate by the reverse transcription of RNA intermediates, such as copia-like elements in D. melanogaster (Finnegan and Fawcett, 1986) and the Ty element in Saccharomyces cerevisiae (Boeke et al., 1985). These elements, also known as viral retrotransposons, resemble the integrated genomes of retroviruses in that they carry long terminal repeats (LTRs). LINEs lack LTRs, and their expression is controlled by promoters which are located within the transcribed region (Mizrokhi et al., 1988; Swergold, 1990; Minchiotti and Di Nocera, 1991; Minakami et al., 1992; Contursi et al., 1993; McLean et al., 1993).
By means of transient transfection assays we monitored the
expression of constructs in which the reporter CAT gene was under the
control of various Doc DNA segments in Drosophila Schneider II
(S2) cells. We show that distinct cis-acting DNA elements, clustered in
a 50-bp long DNA region located at the 5` end of unit-length Doc
copies, cooperate to control RNA initiation. In addition, we found that
sequences located approximately 200 bp downstream from the 5` end inhibit the
expression of the reporter CAT gene in a position- and
orientation-dependent manner. The inhibition appears to be due to
reduced translation rather than to impaired synthesis of CAT mRNA.
Doc DNA called
corresponds to a HinfI-NaeI fragment spanning
residues 219-267 of su(f)
.
was cloned
in the plasmid suA, either in the AvaI site upstream of the su(f)
promoter (constructs
1suA and
2suA), or in the HindIII site downstream from the su(f)
promoter (constructs suA
1 and
suA
2).
was also cloned into the HindIII site of
RSVCAT downstream from the RSV promoter (constructs RSV
1 and
RSV
2). The subregions
and
(see Fig. 2A), obtained by digestion of suN with DdeI and AvaI, were cloned in the HindIII
site of plasmid suA downstream from the su(f)
promoter (constructs suA
1 and suA
1, respectively).
RSV
1 and RSV
1 were constructed by replacing the HindIII-NcoI fragment spanning the 5` end region of
the CAT gene in RSVCAT with HincII-NcoI fragments
from suA
1 and suA
1. In all of these constructs, numbers
indicate whether the DNA of interest has been cloned in the direct (1) or reverse (2) orientation. Plasmids s26 and
Doc6-21 were constructed by first annealing the complementary
oligonucleotides SA1/SA2 (s26) and 6A1/6A2 (Doc6-21). The
double-stranded products, which are blunt at one end and give SalI 5` plus-strand overhangs at the other end, were cloned
between the AvaI and SalI sites of pEMBL8CAT.
Plasmids s47 and Doc6-42 were similarly obtained by first
annealing the complementary oligonucleotides SB1/SB2 (s47) and 6B1/6B2
(Doc6-42). The double-stranded DNA fragments, which had SalI 5` minus-strand overhangs at one end and were blunt at
the other end, were cloned between the SalI and HindIII sites of s26 and Doc6-21, respectively. To
obtain s47 + 4 and Doc6-42 + 4, s47 and Doc6-42
DNAs were digested with SalI, and the SalI termini
were made blunt-ended by the Klenow enzyme. To construct s47 - 5,
s47 was digested with SalI, and SalI termini were
made blunt-ended with mung bean nuclease. Mutations were introduced in
the s47 context by annealing complementary oligonucleotides carrying
specific base changes. Double-stranded oligonucleotides were cloned
into s26 either between the SalI and HindIII sites
(s47a, oligonucleotides 47a1 and 47a2; s47b, oligonucleotides 47b1 and
47b2; s47c, oligonucleotides 47c1 and 47c2), or between the AvaI and SalI sites (s47in1, oligonucleotides 47in1a
and 47in1b; s47in2, oligonucleotides 47in2a and 47in2b). Constructs
s/Doc6 and Doc6/s were obtained by exchanging the SalI-NcoI fragments, which span the Doc DE1-DE3
array and the 5` end portion of the CAT gene, between s47 and
Doc6-42. Oligonucleotides used are as follows, listed in a 5` to
3` configuration: D132, GTTAGTTGTGCAAACTGCAC; D218, CATTGTTCTAAGTCCAC;
SA1, GAATTGATTCGGCATTCCACAG; SA2, TCGACTGTGGAATGCCGAATCAATTC; 6A1,
CACTCGTGGATTCGCAG; 6A2, TCGACTGCGAATCCACGAGTG; SB1,
TCGACGGGTGGAGACGTGTTTCTTT; SB2, AAAGAAACACGTCTCCACCCG; 6B1,
TCGACATTCGCGGACGTGTTTCTTT; 6B2, AAAGAAACACGTCCGCGAATG; 47a1,
TCGACTTCATGAGACGTGTTTCTTT; 47a2, AAAGAAACACGTCTCATGAAG; 47b1,
TCGACGGGTGGCTCATGCATTCTTT; 47b2, AAAGAATGCATGAGCCACCCG; 47c1,
TCGACGGGTGGAGACGTGTTGAAGA; 47c2, TCTTCAACACGTCTCCACCCG; 47in1a,
GAATTCGTGAGGCATTCCACAG; 47in1b, TCGACTGTGGAATGCCTCACGAATTC; 47in2a,
CCCGGGTGCCTCTTTCGGCATTCCACAG; and 47in2b, TCGACTGTGGAATGCCGAAAGAGGCAC.
In all cloning procedures, incompatible termini were made blunt-ended
by T4 DNA polymerase before ligation (Sambrook et al., 1989).
The orientation of cloned DNA was assessed in all recombinants by
nucleotide sequence analysis according to standard procedures (Hattori
and Sakaki, 1986).
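The design of the annealed oligonucleotides can be checked computationally. The Python sketch below is illustrative only and is not part of the original protocol. Using the sequences listed above, it tests whether one strand of a pair equals a 4-nucleotide 5` TCGA (SalI-compatible) extension followed by the reverse complement of the other strand, and it locates the base changes carried by 47a1, 47b1, and 47c1 relative to SB1. The helper names are hypothetical, and the numbering assumes that the first nucleotide of SB1 corresponds to residue 23 of the s47 insert, an inference drawn from the position of the SalI site (residues 22-27).

```python
# Illustrative sketch; helper names and the residue numbering are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def has_tcga_overhang(long_strand, short_strand):
    """True if long_strand is a 5' TCGA overhang followed by the
    reverse complement of short_strand (duplex blunt at the far end)."""
    return long_strand == "TCGA" + revcomp(short_strand)

def changed_residues(reference, mutant, first_residue):
    """Residue numbers at which two equal-length oligonucleotides differ."""
    return [first_residue + i
            for i, (ref, mut) in enumerate(zip(reference, mutant)) if ref != mut]

# (overhang-bearing strand, complementary strand), as listed in the text
PAIRS = {
    "SA2/SA1": ("TCGACTGTGGAATGCCGAATCAATTC", "GAATTGATTCGGCATTCCACAG"),
    "6A2/6A1": ("TCGACTGCGAATCCACGAGTG", "CACTCGTGGATTCGCAG"),
    "SB1/SB2": ("TCGACGGGTGGAGACGTGTTTCTTT", "AAAGAAACACGTCTCCACCCG"),
    "6B1/6B2": ("TCGACATTCGCGGACGTGTTTCTTT", "AAAGAAACACGTCCGCGAATG"),
}
SB1 = PAIRS["SB1/SB2"][0]
MUTANTS = {
    "47a1": "TCGACTTCATGAGACGTGTTTCTTT",
    "47b1": "TCGACGGGTGGCTCATGCATTCTTT",
    "47c1": "TCGACGGGTGGAGACGTGTTGAAGA",
}

for name, (long_strand, short_strand) in PAIRS.items():
    print(name, has_tcga_overhang(long_strand, short_strand))
for name, seq in MUTANTS.items():
    # Assumption: the first nucleotide of SB1 is residue 23 of the s47 insert.
    print(name, changed_residues(SB1, seq, first_residue=23))
```

Under this numbering the base changes fall within residues 28-32 (47a1), 34-41 (47b1), and 43-47 (47c1), in line with the intervals mutated in constructs s47a, s47b, and s47c.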
Figure 2:
CAT expression is repressed by Doc
sequences in S2 cells. A, CAT activity detected in cells
transfected with suN, su-218, and su-132 is shown at the top.
The region and the
and
subregions, are diagrammed in
the middle. ATGs found within
,
, and
subregions are highlighted. The sequence of the region
208-267 of su(f)
is reported at
the bottom. Boxed residues correspond to
sequences. The DdeI site at the boundary between the
and
subregions is underlined; ATG triplets are in uppercase
letters. B, effect of
,
, and
sequences on CAT
expression directed by su(f)
and RSV
promoters. The orientation of
,
, and
in each
construct is denoted by an arrow. In this panel, as in Panel A, S2 cells, transfected with 5 µg of test plasmid
and 5 µg of the internal control
F-gal, were assayed for CAT
activity. Relative enzymatic activities are expressed as described in
the legend to Fig. 1. C, Northern analysis of suA
1
and suA
2 transcripts. Total RNA (30 µg) from S2 cells
co-transfected with 5 µg of
F-gal and 5 µg of either
suA
1 (lane 1) or suA
2 (lane 2) was analyzed
by Northern blot using as probes ³²P-labeled DNA fragments
spanning CAT and β-galactosidase sequences (see ``Materials
and Methods''). Bands corresponding to CAT transcripts are
indicated by an arrow. Upper hybridization bands correspond to
β-galactosidase transcripts directed by
F-gal.
Figure 1:
Expression of Doc-CAT plasmids in S2
cells. Thick bars denote Doc DNA from the elements su(f), Doc11, and Doc6 flanking 5` the
CAT gene in the various constructs analyzed. Thin bars denote
flanking genomic DNA. The interval of Doc DNA included in each
construct is given. Restriction sites within Doc DNA used for cloning
are indicated. S2 cells, transfected with 5 µg of each of the
listed recombinants and 5 µg of the internal control plasmid
F-gal, were assayed for CAT activity. The amount of lysate used in
each assay was normalized to the expression of the cotransfected
F-gal plasmid. Numbers express relative enzymatic
activities, obtained by dividing the specific activities by the level
of suN expression, and represent average values of three to four
independent transfections carried out with different cell populations.
S.D. are indicated.
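The normalization described in this legend can be sketched in a few lines of Python. The example is a minimal illustration under stated assumptions: the specific activities are invented numbers in arbitrary units, the function name is hypothetical, and each value is divided by the mean suN level (the legend does not state whether ratios were instead computed within each transfection).

```python
from statistics import mean, stdev

def relative_activity(specific, reference):
    """Mean and sample S.D. of specific activities divided by the mean reference level."""
    ref_level = mean(reference)
    ratios = [value / ref_level for value in specific]
    return mean(ratios), stdev(ratios)

# Hypothetical specific activities (arbitrary units), three independent transfections each.
suN_activity = [1.5, 2.0, 2.5]     # reference construct
suA_activity = [12.0, 13.0, 14.0]  # test construct

rel_mean, rel_sd = relative_activity(suA_activity, suN_activity)
print(f"suA relative to suN: {rel_mean:.1f} +/- {rel_sd:.1f}")
```

With these invented inputs the test construct scores 6.5 ± 0.5 relative to suN; real values would of course depend on the measured activities.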
Figure 6:
Alignment of the 5` end regions of Doc, F,
I factor, and jockey elements. (C/G)ATTCG motifs in Doc6 and su(f) are underlined. Site(s)
of transcription initiation and the regions defined as DE1, DE2, and
DE3 are boxed. The regions of su(f)
and Doc6 in which SalI sites have been created are highlighted. References (in parentheses) are as follows: Doc6, su(f)
and Doc 11 (this work; Driver et al., 1989), F (Minchiotti and Di Nocera, 1991), I factor
(Fawcett et al., 1986), and jockey (Mizrokhi et al., 1988). Residues flanking the element Doc11 are in lowercase
letters.
An AluI site conserved in the Doc 5` UTR was used to construct three clones (Fig. 1, suA, 11A, and 6A) in which
approximately 50 bp from the 5` end of each element preceded the CAT gene. These
constructs directed CAT expression at higher (5-8-fold) levels
than the parental ones, suN, 11N, and 6N (Fig. 1).
Information sufficient to direct transcription is thus restricted in Doc to a relatively small DNA interval. This contrasts with what has been reported for the F element, in which basal transcription is stimulated by sequences located far downstream from the 5` end region (Contursi et al., 1993).
One interpretation of the data is that the region
219-267, which we will call herein , hosts a silencer-like
element. Silencers generally inhibit transcription in a position- and
orientation-independent manner (Laurenson and Rine, 1992).
,
cloned in either orientation upstream of a copy of the D.
melanogaster hsp70 promoter transcribing the CAT gene, did not
reduce CAT expression (data not shown). Subsequently,
was cloned
into the plasmid suA, either upstream (
1suA and
2suA) or
downstream (suA
1 and suA
2) of the su(f)
promoter. We found that CAT expression was inhibited only when
flanked 3`, in direct orientation, the su(f)
promoter (Fig. 2B). CAT levels were reduced,
again in an orientation-dependent fashion (control data not shown) in
cells transfected with RSV
1, a construct in which
had been
cloned between the CAT gene and the Rous sarcoma virus (RSV) promoter (Fig. 2B).
The position- and orientation-dependent
mode of action suggests that inhibitory sequences within act at
the RNA, rather than at the DNA level. This hypothesis is supported by
Northern blot analyses showing that suA
1 transcripts accumulated
at levels 2-3-fold higher than suA
2 transcripts (Fig. 2C). Northern data also ruled out the possibility that the
inhibition is associated with enhanced degradation of CAT transcripts.
Upstream AUG triplets may inhibit translation initiation at natural
AUGs (Liu et al., 1984), and an ATG is present within
(see Fig. 2A). However, by assaying derivatives of both
suA and RSVCAT in which the CAT gene is flanked 5` by either the
left-hand (
) or the right-hand (
) portion of
(Fig. 2A), we observed CAT inhibition with the
,
but not with the
subregion, which includes the
ATG (Fig. 2B). The
subregion extends 5` to residue
208 and carries an ATG, not present in
(Fig. 2A).
We cannot therefore rule out that data obtained with suA
1 and
RSV
1 reflect an inhibitory action of the
ATG, which is the
codon for the initiating methionine of the hypothetical Doc open
reading frame 1, on the translation of the CAT mRNA.
Figure 3:
Primer extension analysis of Doc-CAT RNAs.
Total RNA (40 µg) from S2 cells transfected with 11A, suA, or 6A
was hybridized to a ³²P-5`-end-labeled 30-mer (CAT primer)
complementary to the CAT gene sense strand. Annealed primer moieties
were extended, in the presence of deoxynucleoside triphosphates, by
avian reverse transcriptase. Reaction products were run on a 6%
acrylamide, 8 M urea gel, along with sequencing ladders of the
11A, suA, and 6A templates obtained by the dideoxy chain termination
method using the CAT primer. Predominant RNA start sites, marked by arrows, are shown along with the DNA sequence of the 5` end of
each element at the bottom. Lowercase letters denote flanking
DNA; target sites setting the 5` boundaries of su(f)
and Doc6 are underlined.
Residue 2 in the su(f)
element is A and
not G as previously reported (Driver et al.,
1989).
The
organization of the cis-acting elements involved in the control of Doc
transcription was investigated by assaying plasmids in which the CAT
gene is under the control of different versions of the su(f) promoter region. A construct carrying the
interval 1-26 directed CAT expression, but could not drive
faithful RNA initiation (Fig. 4, construct s26). A
10-fold increase in CAT levels, and a correct transcription pattern,
was observed by using as template a construct carrying the interval
1-47 (Fig. 4, s47). In s26 and s47, the
dinucleotide TT, found in su(f)
at residues
25-26, is replaced by GA. The change, which created in s47 a SalI site between residues 22 and 27 (Fig. 4A), did not perturb promoter function, as judged
by comparing the expression of suA and s47 (data not shown). By
repairing the termini of s47 DNA after SalI cleavage with
either the Klenow enzyme or the mung bean nuclease, we obtained
constructs in which the interval 27-47 of su(f)
was moved either 4 bp away from (s47 + 4) or 5 bp closer to
(s47 - 5) the RNA start site(s). Both spacing changes severely
reduced CAT expression (Fig. 4A). Primer extension data
showed that faithful RNA initiation was abolished in the construct s47
+ 4 (Fig. 4B).
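The 4-bp shift produced by Klenow repair of the SalI termini can be reproduced with a short Python sketch. It is a simplified model, not the authors' procedure: the s47 insert is assumed to be the concatenation of oligonucleotides SA1 and SB1, the function names are hypothetical, and only the fill-in route (s47 + 4) is modeled, not the mung bean nuclease route.

```python
import re

# Assumption: the s47 insert (residues 1-47) equals SA1 followed by SB1.
S47 = "GAATTGATTCGGCATTCCACAG" + "TCGACGGGTGGAGACGTGTTTCTTT"
SALI = "GTCGAC"  # SalI cleaves G^TCGAC, leaving 5' TCGA overhangs

def fill_in_and_religate(seq):
    """Top strand after SalI cleavage, fill-in of both recessed 3' ends,
    and blunt-end ligation of the two arms."""
    cut = seq.index(SALI) + 1    # index just after the G of GTCGAC
    left = seq[:cut] + "TCGA"    # recessed 3' end extended across the overhang
    right = seq[cut:]            # top strand of the right arm starts with TCGAC
    return left + right

def de2_start(seq):
    """1-based position of the first RGACGTGY match, or None if absent."""
    match = re.search(r"[AG]GACGTG[CT]", seq)
    return match.start() + 1 if match else None

s47_plus4 = fill_in_and_religate(S47)
print(len(S47), de2_start(S47))              # insert length and motif position before repair
print(len(s47_plus4), de2_start(s47_plus4))  # the same values after fill-in and religation
```

In this model the repaired insert is 4 bp longer and the RGACGTGY motif starts 4 bp further from the RNA start site region, which is the displacement tested with s47 + 4.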
Figure 4:
A,
effect of base and space changes within the su(f) promoter. The sequence of the interval 1-47 from su(f)
flanking the CAT gene in s47 is
shown at the top. Lowercase letters denote residues mutated to
create a SalI site. Sequence identities are denoted by dashes in the clones listed below. Residues deleted in s47
- 5 are denoted by asterisks. The site of insertion of
extra nucleotides in s47 + 4 is marked by an arrow. The
RNA start site region and the RGACGTGY motif are boxed. Sites
of RNA initiation at residues 6 and 7 and the SalI site
between residues 22 and 27 are underlined. The regions denoted
in the text as DE1, DE2, and DE3 are indicated. Relative CAT activities
and S.D. were calculated as described in the legend to Fig. 1. B, transcription initiation in different su(f)
derivatives. Total RNA (30 µg)
from S2 cells transfected with the constructs listed below was analyzed
by primer extension as described in the legend to Fig. 3. Lanes: 1, s26; 2, s47; 3, s47 + 4; 4, s47; 5, s47a; 6, s47b; 7, s47c.
Sequencing ladders of either s26 (lane R, G + A
reactions; lane Y, C + T reactions) or s47 (lanes
G, A, T, and C) were obtained by the
dideoxy chain termination method with the same 30-mer (CAT primer) used
for RNA extension. Taking into account the length of the vector DNA
segment separating Doc from CAT sequences in the construct s26, bands
in lane 1 corresponding to faithfully initiated transcripts
should be 7-8 nucleotides shorter than the doublet detected in lane 2.
The interval 25-47 spans a
DNA sequence (AGACGTGT, residues 33-42; see Fig. 4A) which is conserved in all Drosophila LINEs (consensus RGACGTGY; see Fig. 6). The transcriptional
pattern of the construct s47b indicates that this sequence has a key
role in Doc transcription (Fig. 4, A and B).
The reduced template activity of both s47 + 4 and s47 - 5
may be due to the novel position of the RGACGTGY motif, which is
located in all Drosophila LINEs at the same distance from the
RNA start site(s) (see Fig. 6). However, the situation is more
complex, since we found that base changes introduced either to the left
(residues 28-32, construct s47a) or to the right of the AGACGTGT
sequence (residues 43-47, construct s47c) also impaired promoter
function (Fig. 4, A and B). Results thus
indicate that three adjacent downstream sequences, herein called DE1,
DE2, and DE3 (Fig. 4A) stimulate transcription of the su(f) element.
Sequences spanning the RNA
start sites also have a critical role in transcriptional promotion. The
RNA start site regions of su(f) (GATTCG, residues
6-11) and Doc6 (CACTCG, residues 1-6) fit the consensus
(C/G)AYTCG (RNA start sites are underlined; see Fig. 3). The
notion that the (C/G)AYTCG motif spans an initiator (Inr) module is
supported by the analysis of the construct s47in1. This clone, in which
residues 6-10 of su(f)
have been changed
from GATTC to cgTga, directed CAT expression
70-fold less
efficiently than the parental construct s47 (Fig. 4A).
We have also constructed a derivative of s47 (s47in2) in which residues
1-7 of su(f)
have been replaced by the
heptamer TGCCTCT, which is the sequence found at the same relative
position in Doc11 (see Fig. 6). Base changes introduced in
s47in2 did not impair CAT expression (Fig. 4A). This
result suggests that the template activity of Doc11 does not correlate
with base changes in the Inr region, but possibly reflects the negative
interference of flanking genomic DNA.
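The positions of the two consensus motifs can be located by a simple pattern search. The Python sketch below is offered as an illustration under stated assumptions: the inserts of s47 and Doc6-42 are reconstructed by joining oligonucleotides SA1 + SB1 and 6A1 + 6B1, so they carry the engineered SalI sites rather than the wild-type residues, and the function and dictionary names are hypothetical. The consensus (C/G)AYTCG is written as SAYTCG in IUPAC notation.

```python
import re

# Assumption: inserts reconstructed by joining the annealed oligonucleotides.
S47 = "GAATTGATTCGGCATTCCACAG" + "TCGACGGGTGGAGACGTGTTTCTTT"  # SA1 + SB1
DOC6_42 = "CACTCGTGGATTCGCAG" + "TCGACATTCGCGGACGTGTTTCTTT"   # 6A1 + 6B1

IUPAC = {"R": "[AG]", "Y": "[CT]", "S": "[CG]"}

def motif_positions(seq, consensus):
    """1-based start positions of all (possibly overlapping) consensus matches."""
    pattern = "".join(IUPAC.get(base, base) for base in consensus)
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", seq)]

for name, seq in (("s47", S47), ("Doc6-42", DOC6_42)):
    inr = motif_positions(seq, "SAYTCG")    # (C/G)AYTCG hexamers
    de2 = motif_positions(seq, "RGACGTGY")  # RGACGTGY octamers
    print(name, "SAYTCG:", inr, "RGACGTGY:", de2, "spacing:", de2[0] - inr[0])
```

In these reconstructed inserts the first (C/G)AYTCG hexamer and the RGACGTGY motif are separated by the same number of residues in both constructs, which agrees with the fixed spacing noted in the text.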
The organization of promoter
sequences is similar, on the whole, in su(f) and
Doc6 (Fig. 5). The construct Doc6-21, which carries the
interval 1-21 of Doc6, directed CAT expression
50-fold less
efficiently than Doc6-42, a construct in which Doc6 DNA extended
3` to residue 42 (Fig. 5). Faithful RNA initiation was observed
with Doc6-42, but not with Doc6-21 (data not shown). By
mutating residues 17 and 22 of Doc6, a SalI site was also
introduced into Doc6-42. SalI sites in s47 and
Doc6-42 are at the same distance from the RNA start site region (Fig. 5). Similarly to what was observed with the s47 + 4
construct (Fig. 4), the insertion of 4 bp in the Doc6 promoter
region significantly reduced CAT expression (Fig. 5, construct Doc6-42 + 4) and abolished faithful RNA initiation
(data not shown). The DNA region dislodged in Doc6-42 + 4
corresponds to the DE1-DE3 array. Interestingly, while DE2 (with
a single base pair change) and DE3 are conserved in Doc6 and su(f)
, the DE1 region varies (Fig. 5 and Fig. 6). Comparison of the template activities of Doc6-42,
s47, and two chimeric clones (Doc6/s and s/Doc6) in which regulatory
sequences of Doc6 and su(f)
have been exchanged,
suggests that the two DE1-DE3 arrays are functionally equivalent (Fig. 5).
Figure 5:
Analysis of the Doc6 promoter. Residues
1-42 from Doc6 flanking the CAT gene in Doc6-42 are shown
at the top. In the clones listed below, sequence identities
are denoted by dashes. Residues 1-47 from su(f) flanking 5` the CAT gene in s47
are shown at the bottom. In Doc6/s and s/Doc6, residues from
s47 are in uppercase letters. The SalI sites in
Doc6-42 and s47, underlined in the figure, have been
exploited to exchange Doc6 and su(f)
regulatory sequences in constructs Doc6/s and s/Doc6 (see
``Materials and Methods''). Sites of RNA initiation in Doc6
and su(f)
are boxed. The site
of insertion of extra nucleotides in Doc6-42 + 4 is marked
by an arrow. The regions denoted as DE1, DE2, and DE3 in su(f)
are shown. Relative CAT activities
and S.D. were calculated as described in the legend to Fig. 1.
From these results we conclude that the 5` end region of Doc elements lacks an antisense promoter. CAT activity directed by the construct suN-inv likely results from the translation of heterogeneous transcripts initiated within vector sequences.
In many polII genes, transcriptional signals are located downstream from the CAP site (Ayer and Dynan, 1988; Soeller et al., 1988; Perkins et al., 1988; Thummel, 1989; Nakatani et al., 1990; Hariharan et al., 1991; Fridell and Searles, 1992). LINEs represent an interesting system for studying the nature and organization of intragenic promoters, because their expression is regulated by DNA elements that are largely internal to the transcriptional unit. The data presented in this work extend, in several respects, the picture emerging from analyses carried out with other Drosophila LINEs.
In spite of significant
sequence divergence, the 5` end regions of su(f) and Doc6, two ``full-length'' members of the Drosophila Doc LINE family, directed CAT expression at
comparable levels in S2 cells (Fig. 1). Transcription is
initiated in the two elements within similar DNA tracts fitting the
consensus (C/G)AYTCG (Fig. 3). A related motif, CATTCG, is found
at the 5` end of Drosophila jockey elements (see Fig. 6), and its deletion abolished jockey-dependent expression
in S2 cells (Mizrokhi et al., 1988). All of these hexamers
share an A at the RNA start site flanked 3` by pyrimidines, a sequence
which is the core motif in many Inrs (Weis and Reinberg, 1992; Javahery et al., 1994). The notion that the (C/G)AYTCG motif overlaps an
Inr module is supported by the analysis of the site-directed mutant
s47in1 (Fig. 4A).
Doc11 is similar to su(f) at the 5` end but lacks residues 1-7 (Fig. 6). The inability of Doc11 sequences to drive faithful and
efficient transcription (Fig. 1 and Fig. 3) is not due to
changes in the RNA start site region (see construct s47in2 in Fig. 4A), and possibly reflects the negative action of
flanking genomic DNA.
Three copies of the sequence (C/G)ATTCG, and
one copy of the sequence CATTCC, flank 3` the RNA start site regions of
Doc6 and su(f), respectively (Fig. 6).
None of these repeats primes transcription (Fig. 3). Selectivity
is not dictated by base content, since the sequence GATTCG is found
both in su(f)
and Doc6, but transcription is
initiated within the su(f)
hexamer only. We favor
the hypothesis that a functional hierarchy is imposed by the nature of
the trans-acting factors interacting with Doc promoters. According to
this view, sequence redundancy may facilitate the binding of a protein
to the Doc 5` end. In transcriptionally competent complexes, however,
such a protein would be able to recognize only one target as Inr
because of stereospecific interactions with one or more factors bound
to downstream promoter elements. The fidelity and the efficiency of
transcription, both in su(f)
and Doc6, is
controlled by a DNA region located
20 bp downstream from the RNA
start sites (Fig. 4 and Fig. 5). Sequences found at
nearly the same place within the 5` UTR of HIV-1 enhance the
activity of the HIV-1 Inr in a distance-independent fashion
(Zenzie-Gregory et al., 1993). In contrast, the distance
between the Inr and 3` flanking sequences is critical in the Drosophila mdg1 (Arkhipova and Ilyin, 1991) and the adenovirus IVa2 (Chen et al., 1994) promoters. Spacing changes
are similarly not tolerated in the Doc promoter (Fig. 4 and Fig. 5). Although it cannot be formally excluded that negatively
acting sequences had been inserted (or created) both in s47 + 4
and Doc6-42 + 4, and crucial ones had been deleted in s47
- 5, we believe that transcription in all of these constructs is
inhibited by loss of protein-protein interactions due to the
misalignment of promoter modules. Site-directed mutations introduced in
the su(f)
promoter revealed that multiple DNA
elements located downstream from the Inr stimulate transcription. The
key element is a DNA sequence (RGACGTGY motif or DE2) which is
conserved in all Drosophila LINEs at a fixed distance from the
RNA start site(s) (Fig. 6). Similarly to what has been reported
for jockey (Mizrokhi and Mazo, 1990) and F (Contursi et al.,
1993) elements, DE2 is absolutely required for Doc transcription (Fig. 4). To a lesser but significant extent, base changes
introduced within the adjacent DE1 and DE3 regions also reduced CAT
expression (Fig. 4). This finding is novel and suggests that the
functional organization of Drosophila LINE promoters may be
more complex than predicted from previous transient transfection
assays. Sequences similar to DE1 or DE3 are found neither in other
LINEs (Fig. 6) nor in Antennapedia and engrailed, two Drosophila genes in which
transcription is also regulated by RGACGTGY motifs located in the 5`
UTR (Soeller et al., 1988; Perkins et al., 1988).
Since the analysis of Doc6 and su(f)
chimeric
constructs established that DE1-DE3 arrays are functionally
equivalent (Fig. 5), the peculiar substitution of DE1 sequences
in Doc6 with a copy of the RNA start site region (Fig. 6) leads us
to speculate that the same protein may bind both to the Inr and the DE1
region. This hypothesis is supported by knowledge that some Inrs are
recognized by transcription factors with multiple binding specificities
(Seto et al., 1991; Du et al., 1993). On the other
hand, it cannot be ruled out that two distinct proteins interact with
the Inr and DE1 in the su(f)
promoter.
Sequences analogous to DE1 and DE3 are plausibly present in other
LINEs. In the I factor, an Inr-like element spanning residues 1-4
is flanked 3` by a RGACGTGY motif spanning residues 29-36 (Fig. 6). In contrast with data reported in this work, deletion
of the regions corresponding to DE2 and DE3 in Doc reduced only 2-fold
the transcription of the I factor (McLean et al., 1993). This
finding, and the inability of the I factor region 1-20 to direct
faithful transcription in S2 cells, favor the notion that
in this LINE sequences crucial for transcription are located in the
region corresponding to DE1 in Doc. The relative contribution of
downstream cis-acting elements to transcriptional promotion may thus
vary among Drosophila LINEs, and plausibly rely, because of
the heterogeneity of DE1 and DE3 regions, on the recruitment of
distinct trans-acting factors.
Significant similarities between the 5` UTRs of Doc and other Drosophila LINEs are restricted to matches in the promoter region. Therefore it is not surprising that Doc, although closely related to the F element at the gene products level (O'Hare et al., 1991), lacks an antisense promoter.
Selective accumulation of LINE transcripts in specific
developmental stages or cell types has been described (Mizrokhi et
al., 1988; Chaboissier et al., 1990; Lachaume et
al., 1992; Martin and Branciforte, 1993; Minchiotti et
al., 1994). Less is known, however, about the synthesis of LINE
proteins (McMillan and Singer, 1993; Branciforte and Martin, 1994;
Trelogan and Martin, 1995). Data reported in this study suggest that
the expression of Doc elements may be regulated at the gene products
level. Sequences found at the boundary between the 5` UTR and the open
reading frame 1 region inhibited CAT expression in an orientation- and
position-dependent manner (Fig. 2, A and B).
Northern RNA blotting data (Fig. 2C) substantiated the
hypothesis that inhibitory sequences act at a post-transcriptional
level, and ruled out that reduced CAT production is due to enhanced
degradation of CAT mRNA. We cannot exclude that ATG triplets present
within and
regions inhibit CAT expression by impairing
translation of the CAT mRNA. This hypothesis is, however, weakened by
results obtained with the Doc
subregion (Fig. 2B). We favor the hypothesis that, by interacting
with a cellular component, transcripts carrying
(or
)
sequences are somehow compartmentalized, thereby reducing their
translation. The effect of inhibitory sequences was more pronounced in
transcripts driven by the RSV promoter (Fig. 2B). This
may reflect differences in the folding of inhibitory sequences and
consequently in the ability to interact with a cellular component,
within distinct RNAs.
A DNA region exhibiting 70% homology to
the
-
interval is found in the F element between residues 184
and 243. A 3` deletion derivative lacking this region (Fin3` +
175) is transcribed
10-fold less efficiently than F-cat1, a
construct in which F DNA extended 3` to residue 267 (Minchiotti and Di
Nocera, 1991). This result correlates with the removal of sequences
located between 193 and 207 which stimulate F basal transcription
(Contursi et al., 1993). However, Fin3` + 175 directs CAT
expression only 2-fold less efficiently than F-cat1. This
observation indirectly suggests that inhibitory sequences may be
conserved in F. Germ line transformation experiments will eventually
clarify whether this type of post-transcriptional regulation operates in vivo and whether the inhibition is enhanced (or relieved) in
specific cell types.
At present, genetic conditions which trigger transposition are known only for the Drosophila LINE I factor (Bucheton, 1990). In this respect, the isolation of a fly stock in which members of the Doc family transpose at a high rate (Pasyukova and Nuzhdin, 1993) provides a basis for analyses aimed at characterizing the mechanisms that control the expression and mobilization of the Doc retrotransposon in the organism.