(Received for publication, September 13, 1996, and in revised form, November 18, 1996)
Intracisternal A-particle (IAP) sequences are
endogenous retrovirus-like elements present at 1,000 copies in the
mouse genome. We had previously identified IAP-related transcripts of
unusual size (6 and 10 kilobases (kb)), which are observed exclusively in the liver of the aging mouse. In this report, using cDNA
libraries that we have constructed from the liver mRNAs of an aged
DBA/2 mouse, we have cloned and entirely sequenced the corresponding cDNAs. Both are initiated within the 5′
Intracisternal A-particle (IAP)
sequences are moderately reiterated transposable elements
(approximately 1000 copies in the mouse genome), which are closely
related to retroviruses and transpose via the reverse transcription of
an RNA intermediate (1, 2). These sequences, 3.5-7.2 kb long, have
been classified into different subgroups, each represented by a few
hundred copies (3, 4), depending on the presence of various internal
deletions (class I and IΔ
The in vivo pattern of expression of IAPs is complex, owing
to the large number of these elements in the genome and to possible position effects. In an extensive in vivo analysis of IAP
expression in transgenic mice, using LTR-driven reporter genes, we have
previously identified a specific "niche" for expression of these
elements in the stem cells of the germline (6). Expression of the IAP transgenes in these cells was found to be position-independent, and
should therefore reflect an intrinsic property of these elements. It is
consistent with a possible role in adaptive processes, since transposition specifically in the germline should result in inherited mutations. However, global Northern blot analysis of endogenous IAP
expression using appropriate probes has also revealed IAP transcripts
in several mouse somatic tissues with, for instance, maximum expression
in the thymus (reviewed in Refs. 1 and 7). The occurrence of such
transcripts is intriguing, as expression of the IAP transgenes was not
detected in these tissues, and a plausible interpretation could be that
they simply originate from a limited number of IAP elements, which
would be embedded, for instance, within genomic domains specifically
activated in the corresponding tissues. An interesting situation was
previously unraveled in an extensive analysis of IAP transcripts in the
organs of aging mice (7). The rationale of this study was that aging is
a process associated with cellular dysfunctions, for which it can be
hypothesized (among other theories) that they result from an
accumulation of somatic mutations (reviewed in Refs. 8-10). These
could be triggered by transposable elements, acting as insertion mutagens. In this respect, we had actually revealed IAP transcripts of
abnormal size (6 and 10 kb) which are induced at least 30-fold in the
liver of aged mice, regardless of the mouse strain tested. A similar
induction of gene transcription has only rarely been described in
relation to aging, especially for potentially mutagenic transposable elements, and it was therefore of interest to characterize the IAP
elements associated with these transcripts (i) to determine whether
they correspond to "functional" IAP sequences, and (ii) to
characterize the mechanisms of their induction upon aging. We have
therefore constructed cDNA libraries with the RNAs from the liver
of aged mice, and in this report we demonstrate that the IAP
transcripts induced with age in fact originate from a single locus that
we have cloned, and correspond to a read-through from a non-coding IAP
element into the adjacent cellular DNA, which further contains an open
reading frame with homology to a yeast transcription factor. These
results demonstrate the importance of position effects on the
expression of highly reiterated elements, and identify a locus that
should now allow the characterization of domains or genes possibly
directly associated with the aging process.
Mice from four different inbred strains (C57BL/6,
C57BL/10, BALB/c, and DBA/2) were obtained from Iffa-Credo
Laboratories; they were housed individually in our animal facilities,
where they were fed ad libitum. All mice were killed by
cervical dislocation and had no evident pathology at the time of
death.
PCRs were performed with a Hybaid thermal cycler. 1 ng
of plasmid DNA, 100 ng of genomic DNA, or 5 µl of phage stock
(previously subjected to three cycles of liquid nitrogen freezing and
thawing) were amplified with 1 unit of Taq DNA polymerase in
the buffer supplied by the manufacturer (Amersham). Initial
denaturation was for 5 min at 95 °C, followed by 30 cycles at
94 °C for 1 min, 55 °C for 1 min, and 72 °C for 1 min. Final
extension was at 72 °C for 10 min. PCR products were gel-purified
using the freeze-thaw phenol extraction method (11),
ethanol-precipitated, and resuspended in 20 µl of TE buffer. PCR
products were then either used directly as a probe, or cloned after
Klenow enzyme treatment into the EcoRV site of the
pBluescript vector (Stratagene).
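As a rough check on the cycling conditions described above, the nominal hold time of the program can be tallied. The sketch below is illustrative only: the `Step` type and the function name are hypothetical, and ramp times between temperatures (which depend on the thermal cycler) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int      # block temperature in degrees Celsius
    minutes: float   # programmed hold time in minutes


# Cycling conditions as stated in the text.
INITIAL_DENATURATION = Step(95, 5)
CYCLE = [Step(94, 1), Step(55, 1), Step(72, 1)]  # denature, anneal, extend
N_CYCLES = 30
FINAL_EXTENSION = Step(72, 10)


def total_hold_minutes() -> float:
    """Sum of programmed hold times, ignoring ramping between steps."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return (INITIAL_DENATURATION.minutes
            + N_CYCLES * per_cycle
            + FINAL_EXTENSION.minutes)


if __name__ == "__main__":
    print(f"Nominal hold time: {total_hold_minutes():.0f} min")
```

The actual run time on a given instrument would be longer, since heating and cooling between steps are not counted here.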
The following primers were synthesized: IAPE1,
5′
DNA probes were labeled by random priming using
[32P]dCTP and a commercial kit (Amersham). IAP DNA probes
were obtained by digestion of plasmid pIL3, a pBluescript SK plasmid
containing IAP-IL3 (12). Nucleotide numbering refers to the sequence published in
Ref. 12. Fragments were extracted using the freeze-thaw phenol
extraction method (11). The almost full-length IAP probe is a 4.5-kb
XmnI-MluI fragment (nt 602-5081), probe IAP-5′
Total cellular
mRNAs from various tissues were extracted using the guanidinium
isothiocyanate extraction method (15). Poly(A)+ RNA was
isolated from total RNA using a pre-packed spun column containing
oligo(dT)-cellulose (Pharmacia Biotech Inc.). For Northern blot
analysis, 10 µg of total RNA/lane were fractionated on
agarose/formaldehyde gels. RNAs were transferred to a nylon membrane
(Hybond N, Amersham) in 0.15 M NH4Ac buffer and
hybridized with riboprobes or DNA probes. Loading of equal amounts of
RNA in each lane was assessed from the ethidium bromide-stained ribosomal RNAs upon
UV illumination of the membrane. Prehybridization and hybridization
were performed in 50% formamide with 0.75 M NaCl, 1% SDS,
50 µg/ml salmon sperm DNA, and 5% dextran sulfate when using
riboprobes, or with 7% SDS, 1 mM EDTA, 0.5 M
NaHPO4, pH 7 (16), when using DNA probes. Hybridized blots were first rinsed in 2 × SSC at room temperature and then washed at 65 °C (time length and SSC concentration varied depending on the
probe; see figures). Filters were then exposed for at least 24 h.
High molecular
weight DNAs were extracted from different tissues, which were first
crushed on dry ice and then incubated overnight at 55 °C in a
solution containing 50 mM Tris-HCl, pH 7.5, 0.1 M NaCl, 25 mM EDTA, 1% SDS, and 150 µg of
proteinase K. DNA was phenol-chloroform-extracted,
ethanol-precipitated, washed twice in 70% ethanol, dried under vacuum,
and resuspended in TE buffer. DNA digestion was carried out with a
4-6-fold enzyme excess. Digested DNA was electrophoresed on 0.8%
agarose gels, transferred to nylon membrane (Hybond N, Amersham), and
cross-linked to the membrane using a Stratalinker apparatus
(Stratagene). Hybridization and washing were carried out as described
for Northern blot analysis.
Two cDNA
libraries (one oligo(dT)-primed and one random-primed) were constructed
using the
Positive
In situ hybridization experiments were carried out using metaphase
spreads from a WMP male mouse, in which all the autosomes except 19 were in the form of metacentric Robertsonian translocations. Concanavalin
A-stimulated lymphocytes were cultured at 37 °C for 72 h with
5-bromodeoxyuridine added for the final 6 h of culture (60 µg/ml
of medium), to ensure a chromosomal R-banding of good quality. The
pB2-16 clone containing a 2-kb insert in pBluescript was
tritium-labeled by nick-translation to a specific activity of 2 × 10⁸ dpm/µg. The radiolabeled probe was hybridized to
metaphase spreads at a final concentration of 25 ng/ml hybridization
solution as described previously (18). After coating with nuclear track
emulsion (Kodak NTB2), the slides were exposed for 20 days at 4 °C,
then developed. To avoid any slipping of silver grains during the
banding procedure, chromosome spreads were first stained with buffered
Giemsa solution and metaphases photographed. R-banding was then
performed by the fluorochrome-photolysis-Giemsa method and metaphases
rephotographed before analysis.
As
previously demonstrated and illustrated in Fig.
1A, Northern blot analysis, using an almost
full-length IAP probe, of RNAs from the liver of aged mice reveals 10- and 6-kb transcripts not observed in the young mouse. These IAP
Probing Northern blots of liver mRNA from old mice with
single-stranded IAP riboprobes (data not shown) first demonstrated that
both IAP-AR transcripts are transcribed from the sense strand, as
expected for normal IAP transcripts. Taking into account the abnormal
length of the IAP-AR1 transcript, we then questioned whether it could
correspond to an env-containing sequence; actually, full-length IAP transcripts characterized to date (i.e. 7.2 kb) all contain a truncated env gene, but Reuss and Schaller
(13) recently cloned a 4-kb cDNA coding for a related full-length
envelope. As shown in Fig. 1B, hybridization with an
env probe failed to detect either of the two IAP-AR transcripts
even under prolonged exposure (only a minor transcript of 1.45 kb could
be detected, possibly corresponding to an alternative splice of the
previously characterized env transcript in Ref. 13).
Finally, Northern blots were hybridized with a series of probes
encompassing different IAP sub-domains (Fig. 1, B and
E). Hybridization with an internal probe immediately
adjacent to the 5′
To characterize further the IAP-AR transcripts, we constructed two
cDNA libraries: a randomly primed cDNA library and an
oligo(dT)-primed one. In both cases cDNAs were synthesized with
poly(A)+ mRNAs from the liver of an 18-month-old DBA/2
mouse, as highest expression of both transcripts was observed in this
strain (7). According to the analysis above, no IAP probe could be used
for a direct selective screening of the IAP-AR versus normal
IAP transcripts. We therefore decided to clone the IAP-AR1 first,
assuming that it corresponds to an IAP read-through transcript (see
above), and even more precisely to a 3′
The strategy for cloning the 6-kb IAP-AR2 and the 3-kb transcripts was
then rather straightforward, taking into account that both are positive
for the XH probe. According to the data in Fig. 1, IAP-AR2 should be
positive for the IAP-pol probes but, unlike IAP-AR1, should be negative
for probe IAP-3′
The IAP-AR1
cDNA discloses, within its 5′
The IAP sequence within IAP-AR1 is then followed by a non-coding
sequence containing 4 B1 and 1 B2 repeats. These are highly reiterated
mobile elements of the mouse genome classically found in intronic
domains, and which are closely related to the human Alu sequences (21,
22). All repeats except one are in the opposite transcriptional
orientation relative to IAP-AR1. The first B1 repeat is immediately
adjacent to the IAP-3′
Analysis of the 6-kb IAP-AR2 and 3-kb transcripts (clones pB2-12 and
pB1-19) reveals the existence of two splicing events (Figs. 2 and 4).
The first splicing event may be described as a gag (nt
378)-to-env (nt 3413) splice, which is very similar to the
gag-to-env splicing event previously described in
Ref. 24. In both cases, the same splice acceptor (SA1, Fig.
4A) was used. Interestingly, the
corresponding region is extremely well conserved among the different
IAP elements described to date. However, the splice donor (SD1, Fig.
4A) is slightly divergent from the one described in Ref. 24.
This may be related to a lesser sequence conservation observed in the
corresponding region among IAP elements.
The second splicing event may be described as an env (SD2,
nt 3572)-to-ORF region (SA2, nt 7528). Clone pB2-12 exhibited the same
env-to-ORF splicing event as clone pB1-19; however, the
SD2* site (nt 3577) used in the former was 5 bp apart from SD2 used in
the latter (Fig. 4B). To ascertain that this "intronic"
region is absent from both the IAP-AR2 and 3-kb transcripts, we used probes corresponding to this domain (probes A, B, and C) or to the 3′
The 3′
In parallel, using the blastX program (University of Wisconsin Genetics
Computer Group), which translates both strands in all six reading
frames, we searched for similarities against the entire NBRF and
Swissprot protein data bases and found a significant homology to the 3′
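The six-frame translation mentioned above can be illustrated with a short sketch. This is not the blastX program itself, only a minimal example of translating both strands in all six reading frames under the standard genetic code; the function names are hypothetical, and codons containing characters other than A, C, G, or T are reported as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Translations of frames +1..+3 (given strand) and -1..-3 (reverse strand)."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each translated frame would then be compared against a protein data base; that comparison step is outside the scope of this sketch.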
Finally, as illustrated in Fig. 7A, a Southern blot
analysis of mouse genomic DNA restricted with three different enzymes detected essentially single bands with the ORF-containing pB2-16 DNA
probe (two bands are observed with PstI, but this is
expected from the nucleotide sequence). This strongly suggests that the
IAP-AR ORF is part of a single copy gene, and might then represent an exon of the mouse homologue of the yeast CCR4 gene. A
chromosomal localization of the corresponding locus in the mouse was
finally performed by in situ hybridization using the same
probe. As expected from the Southern blot analysis presented above, a
single hot spot was identified, which mapped on chromosome 3 in the
B-D region (Fig. 7B).
We have characterized and cloned the two IAP-related transcripts
specifically expressed in the liver of the aged mouse. We previously
demonstrated that expression of these two abnormally sized transcripts,
called IAP-AR1 and -AR2 (7), is a common feature of all four different
mouse strains tested. Sequence analysis now shows that both RNAs are
chimeric transcripts from a canonical type IΔ
The presently
identified locus combines two previously identified situations
resulting in chimeric transcripts: (i) transcriptional read-through,
and (ii) alternative splicing. Transcriptional read-through has
previously been documented for IAPs (19, 24), as well as for other
retrotransposons (e.g. the HERV-R human endogenous retrovirus; Refs. 26 and 27). In some cases read-through results from
deletions or mutations within the polyadenylation signal-containing 3′
The other process resulting in chimeric transcripts involves a
subjugation of the splice sites, which are present in most retroid
elements, and alternative splicing. Leslie et al. (24) described two cases of cellular gene activation generated by an IAP
gag-to-IL3 and IAP
gag-to-GM-CSF splicing event in WEHI-3B derived
cells, and similar events have also been reported with other
retrotransposable elements, such as the human HERV-H elements resulting
in gag-to-calbindin and
gag-to-phospholipase A2-like chimeric
transcripts (29, 30), as well as with infectious retroviruses, such as
the Moloney murine leukemia virus or the avian leukosis virus (31-34)
resulting in gag-to-c-myb and gag- or
env-to-c-erbB chimeric transcripts. This
illustrates the extent to which retroid elements can be recruited and
participate in genome evolution, and emphasizes the close relationship
between retroviruses and endogenous retrotransposons. In the present
study, functional donor splice sites located in two distinct domains of
the IAP sequence have been demonstrated: SD1, which is located in the
IAP gag region close to (but distinct from) the site
described in Ref. 24, and the two 5-bp-apart SD2 and SD2* sites which are located at the 5′
This
study also demonstrates, de facto, the importance of
position effects for the expression of IAP elements. Actually, we have
shown that the IAP-AR transcripts, which account for >50% of the
IAP-related transcripts in the liver of the aged mice, unambiguously
originate from a single IAP element, in a definite genomic locus that
we have mapped. Furthermore, the identified IAP has no special
features; its sequencing has revealed that it is a type IΔ
Several theories have been proposed
to account for the aging process (reviewed in Refs. 8-10).
Deterministic theories have proposed that aging is a programmed
process, whereas stochastic theories have involved accumulation of
errors, of various origins. In the latter case, activation of
transposable elements and endogenous retroviruses has been invoked and
in some cases documented (35, 36) as a possible event participating in
somatic mutations. Along the same line, demethylation upon aging of the
LINE murine mobile elements (and to some extent of the IAPs) has been
reported (37, 38). In fact, the present study (see also Refs. 7 and 39)
clearly shows that there is not a generalized de-repression of IAP
expression upon aging, which would result in an enhanced IAP-mediated
mutagenesis; IAP induction is limited to a single element, which
further is a defective IΔ
The reasons for the specific induction of the IAP-AR transcripts with
aging are still unknown. Spreading of an inactive chromatin conformation from a methylated domain to an adjacent non-methylated region was previously shown to inhibit the expression of a reporter gene in mammalian cells (43), and position effect variegation involving
heterochromatin spreading and resulting in gene silencing has been well
characterized in Drosophila (44). By analogy, the
age-dependent induction of the IAP-AR transcripts could
therefore be associated with the spreading of an active chromatin
conformation from an adjacent gene that would be specifically expressed
upon aging. It could also be more directly associated with the presence of enhancer sequences responsive to age-dependent factors
(e.g. the ubiquitous Age-Dependent Factor of the
rat; Ref. 45). Whatever the case, the identification of the present
locus should allow the characterization of age-responsive elements
and/or genes, and therefore be important for the study of the aging
process in concrete and molecular terms.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) U70139.
We are grateful to L. Cavarec and C. Delaporte for their help in constructing the cDNA libraries, D. Depetris for technical assistance with the in situ
hybridizations, and C. Lavialle and M. Gunther for helpful discussions
and critical reading of the manuscript.
Unité de Physicochimie et
Pharmacologie des Macromolécules Biologiques,
Unité de Génétique Médicale et
Développement,
long terminal repeat of a type IΔ1 IAP sequence, and correspond to a read-through into a unique
flanking cellular sequence containing a 966-nucleotide open reading
frame, located 3′ to the IAP sequence. The 6-kb IAP-related transcript
corresponds to a post-transcriptional modification of the 10-kb
mRNA, and is generated by a splicing event with the donor site in
the IAP sequence, and the acceptor site 5′ to the open reading frame.
This open reading frame is located on chromosome 3, is evolutionarily
conserved, and discloses significant similarity to the yeast CCR4
transcription factor at the amino acid level. The specific expression
of these age-induced transcripts, which account for more than 50% of
the IAP-related transcripts in the liver of old mice, is therefore
entirely consistent with the induction of a single genomic locus, thus
strengthening the importance of position effects for the expression of
transposable elements. Characterization of this locus should now allow
studies on its chromatin and methylation status, and on the
"molecular factors of senescence" possibly involved in its
induction.
1-4) and of one specific 0.3-kb insertion
(class IIA-C). Like retroviruses, they contain two long terminal
repeats (LTRs) with a U3-R-U5 organization, bordering
gag-pol-like open reading frames. They are not infectious,
and as expected disclose a severely truncated envelope gene. The LTRs
contain the signals for the initiation and regulation of transcription
(5′-LTR) and for the polyadenylation of the transcripts (3′-LTR). At
least some of the IAP elements should still be functional as
transposition events can be detected, either directly (2) or indirectly
via the mutations that their transposition provokes, especially in tumor cells in culture where these elements are heavily transcribed and act
as mutagenic agents (reviewed in Refs. 1 and 5).
Animals
-CCAATGCCAGTTCGCCATGATGCT; IAPE3, 5′-AGCTGCACTGGCTGAAATTGCTATGA;
AR-B11, 5′-TTGGACAATCAACTCTGAGTTCTC; AR-B12, 5′-GAATACACACCACTTACTGAGTCC;
AR-B31, 5′-GGATACACCTGACTACACCAATTG; AR-B32, 5′-TTGTGAGTGTACCAGTGTGTGCAG;
AR-B51, 5′-ACAAACTTGGGAGAAGGCTGTG; AR-B52, 5′-AGAAACAGTCCCATAGCCAGCTG;
AR-UTR31, 5′-GCAACTTCCTATCCTAGTTCAG; and AR-UTR32, 5′-TTCATCTGTGTACCATCCTCTACC.
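For readers who wish to reuse the primers, basic properties such as length, GC content, and reverse complement can be computed as in the sketch below. The helper names are hypothetical, and the assignment of the first sequence to IAPE1 is inferred from the order of the list rather than stated explicitly on the same line.

```python
# Primer sequences as listed in the text, written 5' to 3'.
# Assumption: the first sequence in the list belongs to IAPE1.
PRIMERS = {
    "IAPE1": "CCAATGCCAGTTCGCCATGATGCT",
    "IAPE3": "AGCTGCACTGGCTGAAATTGCTATGA",
    "AR-B11": "TTGGACAATCAACTCTGAGTTCTC",
    "AR-B12": "GAATACACACCACTTACTGAGTCC",
    "AR-B31": "GGATACACCTGACTACACCAATTG",
    "AR-B32": "TTGTGAGTGTACCAGTGTGTGCAG",
    "AR-B51": "ACAAACTTGGGAGAAGGCTGTG",
    "AR-B52": "AGAAACAGTCCCATAGCCAGCTG",
    "AR-UTR31": "GCAACTTCCTATCCTAGTTCAG",
    "AR-UTR32": "TTCATCTGTGTACCATCCTCTACC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:9s} {len(seq):2d} nt  GC={gc_fraction(seq):.2f}  "
              f"rc={reverse_complement(seq)}")
```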
is a 280-bp MluI-XmnI fragment (nt 323-602),
probe IAP-3′ is a 480-bp AseI-AseI fragment (nt 4234-4711), probe IAP-LTR is a 370-bp AseI-MluI
fragment (nt 4711-5081), probe IAP-pol1 is a 580-bp
BamHI-XbaI fragment (nt 2116-2694), and probe
IAP-pol2 is an 820-bp XbaI-BamHI fragment (nt
2694-3521); the latter two probes allow a first discrimination between
the different IAP subtypes, as IAP-pol1 and IAP-pol2 are both absent in
IAP type IIB, IΔ2, IΔ3, and IΔ4, and both present in IAP type IIA, IΔ1, and I, whereas IAP type IIC contains IAP-pol2 but not IAP-pol1 (1). The IAP-env probe (nt 1782-2803, referring to the
sequence in Ref. 13) was obtained by PCR amplification using primers
IAPE1 and IAPE3 and mouse genomic DNA. The resulting PCR fragment
(assayed by nucleotide sequencing) was inserted at the EcoRV
site of pBluescript (Stratagene), from which it was then excised as a
XbaI-SpeI fragment. Probe XH is a 300-bp
XhoI-HindIII fragment from plasmid pA3-13 (see
"Results"). Probes A, B, C, and UTR were generated by PCR
amplification of mouse genomic DNA using, respectively, primers AR-B11
and AR-B12, AR-B31 and AR-B32, AR-B51 and AR-B52, AR-UTR31 and
AR-UTR32. IAP riboprobes, sense and antisense, were obtained by
in vitro transcription of linearized plasmid pIL3 with T7
and T3 polymerases. Reactions were performed using
[32P]UTP as described in Ref. 14.
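The probe coordinates and the subtype discrimination described above can be summarized in a small sketch. The function names are hypothetical, the deletion subtypes are written here as IΔ1 to IΔ4, and the lengths computed from the inclusive coordinates differ slightly from the rounded sizes quoted in the text.

```python
# Probe coordinates (nt, inclusive) on the IAP-IL3 sequence, as given in the text.
PROBES = {
    "IAP-5'": (323, 602),
    "IAP-pol1": (2116, 2694),
    "IAP-pol2": (2694, 3521),
    "IAP-3'": (4234, 4711),
    "IAP-LTR": (4711, 5081),
}


def probe_length(name: str) -> int:
    """Length in nucleotides computed from the inclusive coordinates."""
    start, end = PROBES[name]
    return end - start + 1


def candidate_subtypes(pol1: bool, pol2: bool) -> set:
    """IAP subtypes compatible with the pol1/pol2 hybridization pattern."""
    if pol1 and pol2:
        return {"I", "IΔ1", "IIA"}
    if not pol1 and not pol2:
        return {"IΔ2", "IΔ3", "IΔ4", "IIB"}
    if pol2:
        return {"IIC"}
    # pol1 positive with pol2 negative is not described in the text.
    return set()


if __name__ == "__main__":
    for name in PROBES:
        print(f"{name:9s} {probe_length(name)} nt")
    print(sorted(candidate_subtypes(pol1=True, pol2=True)))
```

A transcript positive for both pol probes is therefore compatible with a type I, IΔ1, or IIA element, and further probes are needed to narrow the assignment.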
Zap Express cDNA synthesis kit (Stratagene), following
the manufacturer's instructions. Poly(A)+ RNA from an
18-month-old DBA/2 mouse liver was used for first-strand cDNA
synthesis, using either poly(dT)-unidirectional primers or random-unidirectional linker primers, which both contain a
XhoI site (Stratagene). Second-strand cDNA synthesis was
followed by directional cloning into the
EcoRI/XhoI sites of the
Zap Express arms.
Ligated phage DNA was packaged in vitro using Gigapack II Gold packaging extracts (Stratagene). The libraries were subsequently amplified using XL1-Blue MRF′
as a host strain. 3.5 × 10⁵ and 9 × 10⁵ bacteriophage particles
for the oligo(dT)-primed and the random-primed cDNA libraries,
respectively, were plated at near confluence and duplicate filters were
prepared. Differential screening with 32P-labeled probes
was performed as described for Northern blot analysis.
Zap clones were
grown with helper phages, and plasmids containing cDNA inserts were
rescued as pBK-CMV plasmids following the manufacturer's instructions.
Sequencing was performed with overlapping subclones generated by either
unidirectional deletions using internal restriction sites or an ExoIII
nested deletion kit (Pharmacia), or by direct cloning of isolated
fragments. Plasmid DNAs were affinity purified on Nucleobond AX
Cartridges (Macherey-Nagel). Automated DNA sequencing was performed
using the Taq dye primer cycle sequencing procedure with an
Applied Biosystems Inc. model 373 apparatus. Homology searches were
performed using the entire GenBank™ and the NBRF/PIR data bases with
the help of the French Bisance service (17).
Characterization of the IAP Age-related Transcripts
Age-Related transcripts (IAP-AR1 and -AR2) are
of abnormal size as compared to the canonical 7.2- and 5.4-kb
transcripts classically observed in the mouse tissues, and they were
first characterized by an extensive analysis using various IAP
probes (see scheme in Fig. 1E).
Fig. 1.
Northern blot characterization of the
age-induced IAP-AR1 and -AR2 transcripts using various probes
(A-D), and schematic representation of their structure
(E). Northern blot analysis was with 10 µg of total
RNA from liver of different inbred mice on each lane; the two canonical
7.2- and 5.4-kb IAP transcripts are indicated with open
arrowheads, and the age-dependent liver-specific IAP-AR1 (10 kb) and IAP-AR2 (6 kb) transcripts with filled
arrowheads; positions of the 28 S and 18 S rRNAs are indicated
(r); the length and position of each probe (except for the
env one) is indicated in E. Panel A,
induction of the IAP-AR1 and -AR2 transcripts upon aging in the liver;
y, 6-week-old DBA/2 mouse; o, 24-month-old DBA/2
mouse; probe, full-length IAP. Panel B,
characterization of the IAP domains within the IAP-AR transcripts;
liver RNA from a 24-month-old DBA/2 mouse was hybridized with probes
encompassing different IAP domains (see E), or an
env probe (a 1.45-kb env transcript is indicated
by an arrow). Panel C, characterization of the
non-IAP domains within the IAP-AR and related transcripts; membranes as
in B were hybridized with probes encompassing different domains of the pA3-13 cDNA (identified as positive for the IAP-AR transcripts, see "Results"); positions of the probes are indicated in E. Panel D, characterization of the IAP-AR
transcripts in the liver of old mice (20-24 months) from various
inbred strains (left) and comparison between liver RNAs from
young (14 weeks, y) and old (24 months, o) BALB/c
mice (right); total RNA (10 µg in each lane) from the
liver of the indicated mice were analyzed using probe XH. E,
schematic representation of the IAP-AR transcripts and localization of
the probes; the structure of the IAP-AR1 and -AR2 transcripts (see
sequence in Figs. 3 and 4) is shown, with "+" and "−"
denoting, respectively, positive and negative hybridization on the
Northern blots with the corresponding probes; black boxes indicate the B1 and B2 repeats, and open boxes the IAP
sequence (with the U3-R-U5 domains within the LTRs) and the 966-nt ORF at the 3′ end.
-LTR (IAP-5′), with the two probes in the
pol region (IAP-pol1 and IAP-pol2), as well as with an LTR
probe (not shown), in all cases detected the two IAP-AR transcripts
(Fig. 1E). Only the IAP-3′ probe, located just 5′ to the 3′-LTR, failed to detect IAP-AR2, but still detected IAP-AR1. Altogether, these results indicate that the 10- and 6-kb transcripts contain closely related IAP domains, characteristic of type I, type IΔ1, or type IIA elements. The features of the IAP-AR2 transcript do
not correspond to those of any IAP transcripts characterized to date,
and could result from a 3′ deletion within the progenitor IAP, or from
a post-transcriptional modification; the IAP-AR1 transcript is not
associated with a still hypothetical full-length gag-pol-env
IAP, and most probably corresponds to an extended 5′ or 3′ read-through.
read-through (its level of
transcription, in the sense orientation, was shown to vary among mouse
strains in the same way as that of the canonical IAP transcripts (7), strongly suggesting that it is initiated within an IAP LTR). Accordingly, the strategy for
cloning was to screen the randomly primed cDNA library with a
"border" IAP probe (IAP-3′) and to select positive clones also containing non-IAP sequences. Twenty positive clones were therefore isolated and purified, and inserts subsequently excised as pBK plasmids
(see "Experimental Procedures"). The fraction of IAP sequence
contained in the cDNA inserts was determined after restriction of
the plasmid DNAs, and hybridization of the blots on which they had been
transferred with an IAP probe. Out of the 20 positive bacteriophage
clones, 8 had non-IAP fragments larger than 500 bp and were sequenced.
Hybridization of a XhoI-HindIII fragment (probe
XH, see Fig. 1, C and E) originating from the 3′ region of one of the candidate clones (pA3-13; cDNAs were cloned in an oriented way), finally allowed us to detect the 10-kb IAP-AR1 transcript. Surprisingly, this fragment also detected the 6-kb IAP-AR2,
as well as a 3-kb transcript, thus suggesting that these are
alternatively spliced forms of the IAP-AR1 transcript. As illustrated
in Fig. 1D using the same probe, these three transcripts could be detected in the liver mRNA originating from old mice of
four different mouse strains (with varying intensity depending on the
strain, as previously observed (7) for an IAP probe), and they were all
induced upon aging. These results suggest that the IAP-AR transcripts
originate from the same locus and confirm that their induction is
a general feature related to aging in the liver. The complete
characterization of the IAP-AR1 transcript was then achieved using
probe XH, which was used to screen the 18-month-old liver
oligo(dT)-primed cDNA library to clone the 3′ end of the 10-kb
IAP-AR1 transcript (clones pB2-16 and pB4-28, Fig. 2).
Screening of the randomly primed cDNA library finally allowed us to
identify a clone containing the 5′ end of the transcript (clone pIII-3,
Fig. 2; see below). Sequencing of the subclones provided the composite
cDNA sequence of the 10-kb IAP-AR1 transcript (Fig.
3).
Fig. 2.
Structure of the isolated cDNA
clones. The isolated cDNA clones (name on the
right) are represented with solid lines, with the
detailed structure of the chimeric IAP-AR1 transcript below (as in Fig.
1). SD, splice donor; SA, splice acceptor.
Fig. 3.
Nucleotide sequence of the IAP-AR1
transcript. The sequence of the IAP-AR1 transcript is derived from
the complete sequencing of the pIII-3 and pA3-13 cDNAs, and
partial sequencing of the pB4-28 and pB2-16 clones (see Fig. 2); part
of the internal IAP sequence is not represented (but complete sequence
is available from GenBank, accession number U70139). The IAP LTRs (with
U3, R, and U5) are boxed, as well as the B1 and B2 repeats and the 3′-end ORF; the splice donors (SD) and acceptors
(SA) are indicated and underlined.
: out of 10 clones positive for probe XH that we
isolated from the oligo(dT)-primed cDNA library, one met these
combined criteria (pB2-12, Fig. 2) and was entirely sequenced.
Similarly, a cDNA for the 3-kb transcript should be positive for
the XH probe, negative for both the IAP-3′ and the IAP-pol probes, and
about 3 kb long: combination of these four criteria allowed us to
identify a corresponding 3-kb cDNA (clone pB1-19, Fig. 2) that was
also entirely sequenced.
half, the characteristic features of
an IAP element. Two cDNA clones have been isolated with almost
exactly the same most 5′ sequence, extending (within 10-15 bp) to the
expected IAP transcription start site located at the 5′ end of the LTR
R domain (Fig. 3). The downstream IAP sequences then disclose an
overall 97% identity to the sequence of a type IΔ1 IAP element, and
more precisely to the previously cloned and entirely sequenced IAP-IL3
element (12). Both carry the same internal deletion covering the 3′ end
of the putative gag gene and extending into the 5′ end of
the putative pol gene. However, due to the occurrence of six
stop codons within the normally in-phase fusion
gag-pol gene, the long (3060 nt) contiguous ORF in IAP-IL3 is reduced to 1776 nt in IAP-AR. Translation and homology searches, however, indicate that both coding domains are still 99%
identical. The 3′-LTR contains the typical U3, R, and U5 subdomains, and is bracketed by the characteristic 4-bp inverted repeats (TGTT and
AACA). Alignment of the IAP-AR and IAP-IL3 LTR sequences discloses strong similarities (97.6% identity, excluding the different number of
internal repeats in the R domain, see below). Actually, the IAP-AR LTR
sequence closely resembles that of the IAP elements specifically
expressed in plasmacytomas, i.e. the PC-type IAPs according
to the nomenclature in Refs. 19 and 20, and is clearly distinct from
the LTR of IAP elements specifically activated in the thymus or in
LPS-stimulated B cells, namely the LS-type IAPs (19, 20). In the IAP R
domains, internally located 13-bp direct repeats, from 1 to 6, are
commonly observed (19), which are most probably generated by the
retrotransposition process (2). The IAP-AR sequence contains six such
repeats, whereas IAP-IL3 has only one. Finally, it is noteworthy that
the IAP-AR R domain contains a canonical (AATAAA) polyadenylation
signal which, furthermore, is located within a domain 100% identical
(at least within 100 bp) to that of the "functional"
polyadenylation signal (2) of the IAP-IL3 sequence.
end, consistent with the fact that IAP sequences
are found inserted preferentially close to such repeats (23). At the 3′
end of the IAP-AR1 cDNA, a 966-nt ORF (nt 7661-8632) can be found
(see below), which is followed by a 1104-nt (nt 8633-9718) non-coding
region ending with a poly(A) tail. As expected, a putative
polyadenylation signal (AATAAA, nt 9704-9709) can be found 15 nt 5′ to
this poly(A) tail.
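The position of a polyadenylation signal relative to a poly(A) tail, as described above, can be checked with a short script. This is a minimal sketch under stated assumptions: the toy sequence is invented (it is not the IAP-AR1 sequence), the function name and the minimum tail length are hypothetical, and the spacing is defined here as the number of nucleotides between the end of the hexamer and the first A of the tail, which may differ from the convention used in the text.

```python
def find_polya_signal(seq, signal="AATAAA", min_tail=10):
    """Locate the last `signal` hexamer upstream of a terminal poly(A) run.

    Returns (signal_start, spacing), where signal_start is the 0-based index
    of the hexamer and spacing is the number of nucleotides between the end
    of the hexamer and the first A of the tail. Returns None if the sequence
    has no terminal A run of at least `min_tail` or no signal upstream of it.
    """
    seq = seq.upper()

    # Walk back from the 3' end to find where the terminal run of A's starts.
    tail_start = len(seq)
    while tail_start > 0 and seq[tail_start - 1] == "A":
        tail_start -= 1
    if len(seq) - tail_start < min_tail:
        return None

    # Search only the region upstream of the tail, taking the closest match.
    pos = seq.rfind(signal, 0, tail_start)
    if pos == -1:
        return None
    return pos, tail_start - (pos + len(signal))


# Invented toy sequence: 5-nt leader, hexamer, 15-nt spacer, 20-nt poly(A) tail.
toy = "GCGTC" + "AATAAA" + "GCTGTCCTGTGCTTC" + "A" * 20
print(find_polya_signal(toy))
```

Because the terminal run is detected by simple backtracking over A residues, any genomic A immediately preceding the tail is counted as part of it; a real analysis would compare the cDNA with the genomic sequence.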
Fig. 4.
Sequences of the splice junctions within the
6- and 3-kb IAP-related transcripts. Sequences of the 6- and 3-kb
transcripts are derived from clones pB2-12 and pB1-19, respectively.
Vertical lines indicate homology with the unspliced 10-kb
IAP-AR1 transcript. Splice donors and acceptors (SD and
SA) are indicated with the corresponding consensus sequences
in italics and the essential GT and AG nucleotides in
bold; splice junctions are indicated with an
arrow. A, alignment of the nucleotide sequences
of IAP-AR1 and the 3-kb transcript, at the IAP-to-IAP splice junction.
B, alignment of the nucleotide sequences of IAP-AR1 and the
6- and 3-kb transcripts, at the IAP-to-cellular splice junction.
region (probe XH in the ORF and probe UTR, Fig. 1E). As indicated in Fig. 1 (C and E), probes XH and UTR
hybridized to the 6-kb IAP-AR2 and 3-kb transcripts but probes A, B,
and C did not. This indicates that the pB2-12 and pB1-19 clones are
not recombination artifacts but genuine reverse transcripts of the
6-kb IAP-AR2 and 3-kb transcripts, and that both are generated by
splicing. Analysis of the Northern blot in Fig. 1D further strongly
suggests (in addition to cDNA sequencing) that these two spliced
transcripts derive from IAP-AR1, as the intensities of all three
transcripts vary in a similar manner (both with age and with the mouse
strain tested). Fig. 1D (right) also shows that the occurrence of the
IAP-AR1 and IAP-AR2 transcripts most probably does not result from a
reduction, upon aging, in the efficiency of the splicing that yields the
3-kb transcript (nor from a switch from a hypothetical other promoter
for the ORF), as the 3-kb transcript can hardly be detected in the liver
of young mice (and as no other ORF transcripts of sufficient intensity
are observed). The fact that all three transcripts are expressed at a
level that varies with the mouse strain, as similarly observed for the
canonical IAP transcripts (see Ref. 7), is also consistent with the IAP
LTR being the common promoter for all three transcripts.
end of the IAP-AR
transcripts contain a 966-nt ORF with the potential to encode a
322-amino acid protein. Since most important genes are evolutionarily
conserved, we performed a Zooblot analysis of DNAs from different species,
with a probe encompassing the entire ORF together with its
3′-untranslated region (probe pB2-16, see Fig. 2). As illustrated in
Fig. 5, DNAs from several species (human, hamster, cat, rat, and
mouse) hybridized under moderately stringent
conditions, whereas no hybridization could be detected in
Saccharomyces cerevisiae or Drosophila
melanogaster. DNAs from monkey, dog, cow, rabbit, and chicken were
also positive under similar conditions (data not shown), thus
indicating strong conservation of the ORF at least among
vertebrates.
Fig. 5.
Zooblot analysis of the ORF-containing
locus. Filter of BglII-digested DNA from the indicated
species (5 µg/lane) was hybridized with probe pB2-16. The filter was
washed under moderately stringent conditions (2 × SSC, two 15-min
washes at 65 °C) and exposed for 24 h. The positions of
molecular size standards are indicated on the left.
coding region of the yeast CCR4 gene (25). As illustrated in
Fig. 6, alignment of the IAP-AR ORF with the yeast CCR4
protein sequence shows an overall identity of 28%, with 36% identity
over amino acids 507-612 and amino acids 754-828. Similarity searches
against the GenBank and the DBEST data bases using the tBlastN program
(which compares a protein query sequence against a nucleotide data base
translated in all reading frames) also revealed similarities to two
human ESTs, clones T87026 and H45114 (Fig. 6). These two ESTs do not
overlap: clone H45114 is 35% identical (at the amino acid level) to the 5′
end of the IAP-AR ORF, and clone T87026 is 70% identical to its 3′ end
(Fig. 6). Although these two ESTs are derived from two different
cDNA clones, they may be part of the same gene, which, then, could
be the human homologue of the yeast CCR4 gene. As
illustrated in Fig. 6, it is worth mentioning that numerous residues
are conserved among these three species (human, mouse, and yeast).
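Identity percentages such as those quoted above are computed from a pairwise alignment. The sketch below shows one common way of doing so; it is an illustration only, the two aligned strings are invented (they are not the IAP-AR ORF or CCR4 sequences), and the convention of excluding gapped columns from the denominator is an assumption, since the text does not state which convention was used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are ignored (an assumed
    convention); identity is the share of the remaining columns in which
    both residues are the same.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1

    return 100.0 * identical / compared if compared else 0.0


# Invented toy alignment in single-letter amino acid code.
seq_a = "MKT-LLVA"
seq_b = "MRTAL-VA"
print(round(percent_identity(seq_a, seq_b), 1))
```

With this convention, two of the eight columns are skipped because of gaps and five of the remaining six match. Counting gapped columns in the denominator instead would give a lower figure for the same alignment, which is why the convention matters when comparing reported percentages.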
Fig. 6.
Comparison of the predicted amino acid
sequence of the 3′ end ORF within the IAP-AR transcripts with the yeast
CCR4 and human ESTs. Sequences are represented with the
single-letter code; numbers refer to the yeast
CCR4 amino acid sequence (PIR accession number S36713), and
dashes represent gaps introduced to optimize alignment;
H45114 and T87026 are human ESTs. Amino acid identities of the IAP-AR
ORF with yCCR4 are boxed, and those with H45114 or T87026
are underlined.
Fig. 7.
Characterization of the ORF-containing locus.
A, Southern blot analysis of DBA/2 mouse genomic DNA (5 µg) digested with the indicated enzymes and hybridized with probe
pB2-16 as in Fig. 5. Plasmid DNA (pB2-16) corresponding to 1 or 2.5 gene copies was loaded as a standard. B, gene mapping by
in situ hybridization with probe pB2-16 (see
"Experimental Procedures"). A series of 100 metaphase cells were
examined: there were 149 silver grains associated with chromosomes and
61 of these (41%) were located on chromosome 3. The distribution of
grains on this chromosome was not random; 51/61 (83%) of them mapped
to the B-D region, as illustrated on the figure.
1 IAP element
originating from a single locus. The 10-kb IAP-AR1 transcript
corresponds to a transcriptional read-through, extending out of the
3′-LTR into 5 kb of cellular flanking DNA containing a 1-kb
evolutionarily conserved open reading frame with similarity to the
yeast CCR4 transcription factor. This large transcript gives rise to
two smaller transcripts by post-transcriptional modifications, the
IAP-AR2 transcript by an IAP env-to-cellular DNA splice, and
a 3-kb transcript by an additional internal IAP gag-to-env splice. Due to the occurrence of
transcripts from other IAPs, it was not possible to perform primer
extension experiments to characterize their transcription
start sites definitively. However, the overall size of the composite cDNAs as
well as the sequence of the most 5′ clones that we isolated strongly
suggest that their transcription is initiated, as for canonical IAP
elements, in the R domain of the IAP 5′-LTR.
-LTRs. Mietz et al. (19) have actually described IAP
cDNA from splenic B cells with deletions of the 3′ R domains, and,
in the course of our cloning experiments, we have also isolated two chimeric IAP transcripts with truncations at the 3′ end of the IAP
elements (complete deletion of the 3′-LTR, or R-U5 deletion) followed
by 100-500 bp of non-IAP DNA.2 For the
IAP-AR transcripts, however, their complete nucleotide sequencing has
revealed a rather unique situation in which a read-through takes place
without any evidence for a deletion or mutation within the IAP 3′-LTR
(see "Results"). Although less well documented, sequencing of the
human HERV-R element responsible for a read-through to a 3′-located
Krüppel-like gene (26, 27) also did not reveal evident
alterations within the 3′-LTR, therefore suggesting the possible
existence of mechanisms modulating polyadenylation efficiency within
retroid elements, so as to account for such events.
end of the truncated env gene, in a
highly conserved region among IAPs. Rather surprisingly, the latter
sites have never been described previously, but their position is
reminiscent of internal env donor splice sites commonly
found in complex infectious retroviruses (e.g. HIV and
HTLV-I), where they are responsible for the synthesis of small
regulatory viral proteins (e.g. Tat and Rev; reviewed in
Ref. 28). The presence of functional donor sites at a similar location
in the IAP sequence could be an additional hint for the retroviral
origin of these now strictly endogenous retroid elements.
1 IAP
element, closely related to the previously cloned and characterized
IAP-IL3 (12). The absence of a generalized induction of the IAPs in the
liver of the aged mice and the rather "localized" induction of a
definite and canonical element, therefore, strongly suggest that it is
not the nature of the IAP element per se but rather its
chromosomal position and environment (chromatin structure, DNA
methylation, presence of adjacent age-specific genes or regulatory
sequences, see below) that is determinant for the expression of these
otherwise severely repressed mutagenic elements. This conclusion is in
fact fully consistent with previous experiments in transgenic mice (6)
with marked IAP elements (among which, precisely, IAP-IL3), where IAP
expression was found to be restricted to the male germline, and was
never observed in somatic tissues (even in those of aged
mice).3 Altogether, these two series of
results also strongly suggest that the transcripts detected by global
Northern blot analysis in most mouse somatic tissues (reviewed in Refs.
1 and 7) most probably originate from a very limited number of IAP
elements and not from the overall IAP population (see also Refs. 19 and 20), and are therefore "singularities" reflecting DNA
site-specific, tissue-dependent, and (in the present case)
age-dependent, IAP inductions.
1 element, therefore not autonomous for
transposition. However, this specific induction might not be
"neutral" for the aging mouse. Actually, the age-induced transcripts include a 3′-adjacent cellular sequence containing a large
ORF with strong phylogenetic conservation and similarity to the yeast
CCR4 protein. Although we do not know yet whether the age-induced
transcription results in the actual production of CCR4-like proteins
(antibodies are presently being raised to a recombinant protein to test
this point), it is possible that this ORF, as well as in fact part of
the defective IAP, is a candidate "auto-gene" as defined in Ref.
40, and is therefore involved in auto-immune reactions in the aging
mice. It could also encode a factor involved in transcriptional
regulation: the CCR4 protein is a transcription factor for
the glucose-repressible genes in yeast (41), and recently Draper
et al. (42) identified a mouse protein that binds to the
yeast CCR4 protein, resulting in a protein complex with evolutionarily
conserved transcriptional functions. The presently identified ORF could
be an exon of the mouse homologue (which still remains to be
characterized) of the yeast CCR4 gene, and the IAP sequence could
be inserted within an intron of the corresponding gene. Accordingly,
induction of IAP transcription could result in the synthesis of a
truncated CCR4-like protein with distinct functional properties, which
as such might play a role in aging.
*
This work was supported in part by CNRS, by Association pour
la Recherche sur le Cancer Grants 6552 (to T. H.) and 6372 (to M.-G.
M.), and by Rhone-Poulenc-Rorer (Bioavenir project). The costs of publication of this
article were defrayed in part by the payment of page charges. The article
must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section
1734 solely to indicate this fact.
§
Supported by a Bioavenir fellowship.
¶
Present address: Dept. of Cell Biology, Albert Einstein
College of Medicine, 1300 Morris Park Ave., Bronx, NY 10461.
**
To whom correspondence should be addressed. Fax: 33-1-42-11-52-76;
E-mail: heidmann{at}igr.fr.
1
The abbreviations used are: IAP, intracisternal
A-particle; kb, kilobase(s); nt, nucleotide(s); LTR, long terminal
repeat; ORF, open reading frame; PCR, polymerase chain reaction; bp,
base pair(s); EST, expressed sequence tag.
2
A. Puech, unpublished data.
3
A. Dupressoir, unpublished data.
©1997 by The American Society for Biochemistry and Molecular Biology, Inc.