(Received for publication, October 26, 1994; and in revised form, January 4, 1995)
A genomic clone for a mouse S-adenosylmethionine
decarboxylase (AdoMetDC) gene was isolated from a cosmid library.
Surprisingly, the gene proved to be intronless. With the exception of
three base substitutions (changing 2 amino acids in the deduced
protein), the 1002-nucleotide sequence of the open reading frame was
identical to that of mouse AdoMetDC cDNA. Moreover, the gene contained
a poly(dA) tract at the 3` end and was flanked by 13-base pair direct
repeats. Our findings suggest that this gene has arisen by
retroposition, in which a fully processed AdoMetDC mRNA has been
reverse transcribed into a DNA copy and inserted into the genome. By
polymerase chain reaction, we positively identified the intronless gene
in the mouse genome, and, by primer extension analysis, we proved the
gene to be functional. Thus, its transcripts were found in many cell
lines and tissues of the mouse and were particularly abundant in the
liver. When the open reading frame of the intronless gene was expressed
in Escherichia coli HT551, a strain with no AdoMetDC activity,
it was found to encode a 38-kDa protein, corresponding to AdoMetDC
proenzyme. Although the change of methionine 70 to isoleucine was close
to the cleavage site at serine 68, this protein underwent proenzyme
processing, generating a 31-kDa α subunit and an 8-kDa β
subunit. Importantly, the protein encoded by the intronless gene was
functional, i.e. it catalyzed the decarboxylation of S-adenosylmethionine, and its specific activity was comparable
with that of recombinant human AdoMetDC purified according to the same
procedure.
S-Adenosylmethionine decarboxylase
(AdoMetDC; EC 4.1.1.50) is the rate-limiting enzyme in the
biosynthesis of spermidine and spermine (1) . These polyamines
and their diamine precursor putrescine play important roles in cell
growth and
differentiation(2, 3, 4, 5) .
Therefore, the rate-limiting biosynthetic enzymes, ornithine
decarboxylase and AdoMetDC, are useful targets for chemotherapeutic
agents. Some inhibitors of these enzymes exert strong therapeutic
effects in proliferative and parasitic
diseases(1, 4, 5, 6) . A role of di-
and polyamines in tumor cell growth is suggested by the finding that
overproduction of ornithine decarboxylase is associated with neoplastic
transformation(7, 8, 9) . Recently, ornithine
decarboxylase has been shown to be a mediator of c-myc-induced
apoptosis(10) .
The polyamines also protect cells and cell
components from oxidative damage(11) . They form integral parts
of many biologically important molecules such as bleomycin
A, a cationic antibiotic from Streptomyces
verticillus(12) , squalamine, an aminosterol antibiotic
from the dogfish shark Squalus acanthias(13) , and the
venom of the funnel-web spider Agelenopsis aperta(14) . Moreover, spermidine contributes a portion of its
structure to form hypusine, an amino acid responsible for the
post-translational modification of eukaryotic translation initiation
factor 5A(15) .
AdoMetDC catalyzes the production of decarboxylated S-adenosylmethionine(1) . This is the aminopropyl group donor both in the conversion of putrescine to spermidine (catalyzed by spermidine synthase) and of spermidine to spermine (catalyzed by spermine synthase). Under physiological conditions, decarboxylated S-adenosylmethionine is a limiting factor in polyamine synthesis. Although ubiquitous in eukaryotic cells, AdoMetDC constitutes only a minor fraction of the intracellular proteins. This is partly due to its very short half-life and partly due to the fact that AdoMetDC expression is regulated at multiple levels, transcriptional, translational, as well as post-translational(1, 16) . Interestingly, there is evidence suggesting that the polyamines act as feedback regulators at all of these levels(1, 16) . AdoMetDC expression is induced by hormones, growth factors, tumor promoters, and other stimuli affecting growth(1, 3, 4, 5, 16) .
Cloning and sequencing of human(17) , bovine(18) ,
hamster(19) , rat (17) , and mouse (20, 21) AdoMetDC cDNAs have shown that the mammalian
enzyme is synthesized as a 38-kDa proenzyme (333-334 amino acids)
with no enzymatic activity. The proenzyme is autocatalytically cleaved
into a 31-kDa α subunit (265-266 amino acids) and an 8-kDa β
subunit (67 amino acids), generating the pyruvate prosthetic
group at the N terminus of the α subunit by
serinolysis(22) . The mammalian enzyme contains two pairs of
these nonidentical subunits (α₂β₂) and
probably two catalytic centers (1) . Both subunits seem to be
necessary for catalytic activity. The amino acid sequence of the
protein is highly conserved (about 90% identical) among mammalian
species(17, 18, 19, 20, 21) .
AdoMetDC genes have been cloned and sequenced from Escherichia coli(23) , Saccharomyces cerevisiae(24) , rat(25, 26) , and human (27) sources. In addition to these functional AdoMetDC genes, a processed pseudogene has been identified in the rat genome(28) . The objective of the present study was to isolate a mouse AdoMetDC genomic clone from a cosmid library and to determine the primary structure of this important gene, with the ultimate goal of analyzing its transcriptional regulation. The mouse AdoMetDC gene that was cloned and sequenced (cSAMm1; EMBL Z23077) proved to be completely devoid of introns over its entire length. The presence of a poly(dA) tract at the 3` end as well as flanking direct repeats suggests that this gene has arisen by retroposition(29) , in which a fully processed AdoMetDC mRNA has been reverse transcribed into a DNA copy and inserted into the mouse genome. Of particular interest is our finding that this intronless gene has acquired a functional promoter, as is evident from our identification of the mRNA specifically encoded by the intronless gene (distinguished from the putative bona fide AdoMetDC gene by primer extension analysis). Expression analysis showed that this intronless AdoMetDC gene is strongly expressed in mouse liver. When the intronless AdoMetDC gene was expressed in AdoMetDC-deficient bacteria, it was found to encode a functional enzyme, despite the fact that the coding region contained three base substitutions (as compared with the cloned cDNA(20, 21) ), causing two amino acid substitutions in the enzyme.
DNA fragments were separated according to size by high resolution denaturing gel electrophoresis using a model S2 Sequencing gel electrophoresis apparatus (Life Technologies, Inc.), and the bands were visualized by direct autoradiography. GeneWorks 2.3 (IntelliGenetics) Macintosh software was used for the DNA sequence analysis.
Figure 3: Positive identification of the intronless AdoMetDC gene in the genome of various mouse strains and cell lines by PCR. The experimental design is shown in A, and the result is shown in B. A, Primer 1 corresponds to a sequence in the putative promoter of the intronless gene, and Primer 2 is complementary to a sequence in the ORF. In the presence of the intronless gene, PCR results in amplification of a unique 615-bp product. B, genomic DNA, isolated from inbred mouse strains and from mouse cell lines, was subjected to PCR, and the amplified material was separated by electrophoresis in a 1.5% agarose gel and visualized by UV light. Lane 1, 129/SvJ mouse; lane 2, DBA/2J mouse; lane 3, Ehrlich ascites tumor cells; lane 4, F9 teratocarcinoma stem cells; lane 5, L1210 lymphoid leukemia cells; lane 6, control PCR with no DNA added; lane 7, 100-bp DNA ladder (Pharmacia).
PCR reactions
were carried out in a total volume of 100 µl with 0.5 µg of
genomic mouse DNA, 0.25 µM final concentration of each of
the two oligonucleotide primers, 2.0 units of Tth polymerase, 200
µM final concentration of each of dGTP, dATP, dCTP, and
dTTP in the buffer (10 mM Tris-HCl (pH 8.9), 0.1 M KCl and 0.15 mM MgCl₂) supplied with the Tth
polymerase (Boehringer Mannheim). All PCRs were carried out in a DNA
thermal cycler under the following conditions: 2 min of denaturation at
95 °C and 30 cycles of denaturation at 95 °C for 30 s,
annealing at 60 °C for 40 s, and extension at 72 °C for 1 min
completed by a final extension at 72 °C for 7 min. An aliquot of
the amplified material was loaded on a 1.5% agarose gel containing
ethidium bromide. Amplified material was visualized by UV light.
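The reaction set-up above can be restated as simple arithmetic. The following Python sketch uses only the concentrations, volume, and cycling times given in the text; the function and variable names are illustrative, and ramp times between temperature steps are ignored, so the run time is a lower bound rather than a measured value.

```python
# Sketch: reagent amounts and nominal run time for the PCR described in the text.
# All numbers come from the text; names are illustrative assumptions.

REACTION_VOLUME_UL = 100.0   # total reaction volume (µl)
PRIMER_CONC_UM = 0.25        # final concentration of each primer (µM)
DNTP_CONC_UM = 200.0         # final concentration of each dNTP (µM)


def picomoles(conc_um: float, volume_ul: float) -> float:
    """1 µM equals 1 pmol/µl, so µM x µl gives pmol."""
    return conc_um * volume_ul


def program_seconds(initial_denat_s: int, cycles: int,
                    step_seconds: tuple, final_ext_s: int) -> int:
    """Nominal hold time of the thermal program, excluding ramping."""
    return initial_denat_s + cycles * sum(step_seconds) + final_ext_s


primer_pmol = picomoles(PRIMER_CONC_UM, REACTION_VOLUME_UL)
dntp_nmol = picomoles(DNTP_CONC_UM, REACTION_VOLUME_UL) / 1000

# 2 min at 95 °C, then 30 cycles of 30 s / 40 s / 60 s, then 7 min at 72 °C
total_s = program_seconds(120, 30, (30, 40, 60), 420)

print(primer_pmol, "pmol of each primer")
print(dntp_nmol, "nmol of each dNTP")
print(total_s / 60, "min nominal run time")
```

Under these assumptions each reaction contains 25 pmol of each primer and 20 nmol of each deoxynucleotide, and the program holds for 74 min in total.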
Figure 4:
Identification of the specific transcripts
for the intronless and the bona fide mouse AdoMetDC genes by
dideoxy-ATP-terminated primer extension analysis. The experimental
design is shown in A, and the results are shown in B.
The primer (Primer Tsn4) was designed such that the extension product
would be 25 nt long in the presence of a transcript from the intronless
gene and 31 nt long in the presence of a transcript from the bona
fide gene (corresponding to the published mouse
cDNAs(20, 21) ). Thus the primer was complementary to
a sequence (1811-1791) in the 3`-UTR. Primer extension reactions
were carried out in the presence of ³²P-labeled primer and
dideoxy-ATP using total RNA from the liver (lane 1),
lung (lane 2), spleen (lane 3),
kidney (lane 4), and testis (lane 5) of 129/SvJ mice and from F9 teratocarcinoma stem cells (lane 6), Ehrlich ascites tumor cells (lane 7), and L1210 lymphoid leukemia cells (lane 8). The extension products were fractionated by PAGE and
detected on autoradiographic film using an amplifying
screen.
For the
analysis of cSAMm1 expression by primer extension, 50 µg of total
RNA and 1.25 10
cpm of Primer Tsn4 (5` end-labeled
with [γ-³²P]ATP; 3,000 Ci/mmol) were used per
reaction(34) . For termination by dideoxynucleotides, dATP was
replaced by dideoxy-ATP. In each reaction, 16 units of avian
myeloblastosis virus reverse transcriptase were used. Extension
products were treated with RNase A (20 mg/ml) for 30 min, extracted
with phenol/chloroform, and finally ethanol-precipitated and
fractionated by PAGE (15% gel containing 8 M urea). The gel
was dried, and the radioactivity was detected on autoradiographic film
using amplifying screens.
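The logic of dideoxy-ATP-terminated primer extension can be illustrated with a short Python sketch. Because dATP is replaced by dideoxy-ATP, extension stops at the first position where an A must be incorporated, that is, opposite the first U in the RNA template ahead of the primer. The two template fragments below are made-up sequences chosen only to reproduce the 25-nt and 31-nt product lengths stated in the legend to Fig. 4; they are not the actual 3`-UTR sequences, and the function name is an assumption.

```python
# Sketch: product length in a ddATP-terminated primer extension.
# The template fragments are HYPOTHETICAL; only the primer length (21 nt,
# positions 1811-1791) and the product lengths (25 and 31 nt) come from the text.

PRIMER_LEN = 1811 - 1791 + 1  # 21 nt


def ddatp_product_length(primer_len: int, template_ahead: str) -> int:
    """Return the length of the terminated extension product.

    template_ahead lists the template bases in the order in which the
    reverse transcriptase reads them, starting next to the primer 3` end.
    Extension stops once ddA is incorporated opposite the first U (or T).
    """
    for added, base in enumerate(template_ahead.upper(), start=1):
        if base in ("U", "T"):
            return primer_len + added
    # No terminating base found: the enzyme runs off the fragment.
    return primer_len + len(template_ahead)


# Hypothetical fragments differing at a single position (U versus C).
intronless_like = "GCAUCCGGAU"   # first U is the 4th base read
bona_fide_like = "GCACCCGGAU"    # first U is the 10th base read

print(ddatp_product_length(PRIMER_LEN, intronless_like))
print(ddatp_product_length(PRIMER_LEN, bona_fide_like))
```

A single base difference ahead of the primer is therefore enough to give products of distinct length, which is what allows the two transcripts to be told apart on a sequencing-type gel.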
E. coli strain HT551 cells
carrying the appropriate expression vector construct (pCQV2A or
pCQcSAM) were grown overnight at 32 °C in M9 minimal medium. They
were then diluted 50-fold, and 4 h later the temperature was raised to
42 °C to induce expression. After 2 h, the cells were exposed to
[³⁵S]methionine (50 µCi/ml) for 5 min. Then
(time = 0) incorporation was stopped by adding cold methionine
(final concentration, 100 µg/ml), and the cells were kept growing.
Samples were taken at 0, 5, 10, 15, 20, 30, 45, and 60 min. Cells were
sonicated in 10 mM Tris-HCl (pH 7.5) containing 0.1 mM EDTA, 0.5 mM dithiothreitol, 0.1% bovine serum albumin,
0.1% Triton X-100, 0.1% SDS, and 0.1% Tween 80. After centrifugation
for 30 min at 30,000 × g, 4 °C, the supernatant was
incubated for 30 min at room temperature with antiserum to recombinant
human AdoMetDC (see above), whereupon protein A was added and the
incubation continued for a further 30 min. After thorough washing with
the above buffer, immunoprecipitated proteins were solubilized in
SDS-PAGE loading buffer and subjected to SDS-PAGE (12.5% gel). Gels
were incubated in Amplify (Amersham Corp.), and proteins were
visualized by fluorography.
Figure 1:
Nucleotide sequence of the ORF of the
intronless mouse AdoMetDC gene and the deduced amino acid sequence of
the corresponding protein as compared with the mouse AdoMetDC cDNA (dash, same nucleotide) and amino acid sequences (no mark,
same amino acid)(20, 21) . Numbering of the
nucleotides begins with the putative transcription start site
(+1), and numbering of the amino acids begins with the initiating
amino acid (Met-1). Out of the 1002 nucleotides specifying amino acids,
there are three substitutions (boxes) resulting in two amino
acid replacements. The point mutations affecting nucleotide 540
(G→A) and nucleotide 746 (C→T) result in Met-70→Ile
and Ala-139→Val substitutions, respectively. The point mutation
at nucleotide 858 (A→C), however, is in the third position of a
codon and does not specify a different amino acid. The 334-amino acid
sequence corresponds to the 38-kDa AdoMetDC proenzyme, which is
processed to a 31-kDa α subunit and an 8-kDa β
subunit(1) . The bond split is likely to occur between glutamic
acid 67 and serine 68 (*). In this process, the serine is converted to
pyruvate, which becomes the N terminus of the large α subunit and
acts as the prosthetic group of the enzyme. The glutamic acid becomes
the C terminus of the small β subunit. cDNA*, sequence
from (20) .
The start site for AdoMetDC mRNA transcription has been identified in the human (27) and rat (26) genomes. It is localized to the same G residue (within the 5`-CTCGCTT-3` context) in both species, although the 5`-UTR of the rat mRNA is 5 nt longer (325 nt as compared with 320). When comparing the intronless mouse AdoMetDC gene sequence with that of the human and rat AdoMetDC genes, the nucleotide sequence from the putative transcription start site and downstream is identical for at least 40 nt. Therefore, it is very likely that the transcription of the intronless gene starts at the same G residue, which is consequently numbered +1. On this assumption, the 5`-UTR of the mRNA encoded by the intronless AdoMetDC gene is 330 nt long (Fig. 1).
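With the transcription start site numbered +1 and a 5`-UTR of 330 nt, the nucleotide positions of the three substitutions listed in the legend to Fig. 1 can be mapped onto codons by simple arithmetic. The Python sketch below does this; the helper name is illustrative, and the only sequence-level check included is the Met-70 change, since ATG is the sole methionine codon.

```python
# Sketch: map transcript nucleotide positions onto codons of the AdoMetDC ORF.
# Numbers (330-nt 5`-UTR, 1002-nt ORF, positions 540/746/858) are from the text.

UTR5_LEN = 330
ORF_LEN = 1002
ORF_START = UTR5_LEN + 1            # first nucleotide of the Met-1 codon
ORF_END = UTR5_LEN + ORF_LEN        # last nucleotide specifying an amino acid


def codon_of(nt_pos: int) -> tuple:
    """Return (codon number, position within codon), both 1-based."""
    if not ORF_START <= nt_pos <= ORF_END:
        raise ValueError("position lies outside the ORF")
    offset = nt_pos - ORF_START
    return offset // 3 + 1, offset % 3 + 1


for pos in (540, 746, 858):
    print(pos, codon_of(pos))

# Met is encoded only by ATG, so a G->A change at the third position gives ATA (Ile).
MINI_CODE = {"ATG": "Met", "ATA": "Ile"}
mutated = "ATG"[:2] + "A"
print(MINI_CODE["ATG"], "->", MINI_CODE[mutated])
```

The output places nucleotide 540 at the third position of codon 70, nucleotide 746 at the second position of codon 139, and nucleotide 858 at the third position of codon 176, in agreement with the substitutions described in the legend.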
In addition to the ORF coding for AdoMetDC, there is a small ORF in the 5`-UTR of the intronless gene. This ORF is also present in other mammalian AdoMetDC genes(39) . It codes for a hexapeptide (MAGDIS or Met-Ala-Gly-Asp-Ile-Ser), which appears to suppress translation of AdoMetDC mRNA in a cell-specific manner(39, 40, 41) . The 5`-UTR of the intronless AdoMetDC gene also has a high G+C content, implying that it may have stable secondary structures affecting its translation(42, 43) .
Downstream of the termination codon for the intronless AdoMetDC gene, there are at least two sets of potential polyadenylation signals (Fig. 2). Their positions correspond to those found in human (27) and rat (26) AdoMetDC genes and in mouse AdoMetDC cDNA(20, 21) . The most upstream signal (AATTAAA), at position 1869, yields a 3`-UTR of >540 nt, the actual length depending on the site and extent of polyadenylation. The second signal, at position 3173, is identical to the typical polyadenylation signal AATAAA(44) and yields a 3`-UTR of >1843 nt, the actual length again depending on the site and extent of polyadenylation. The numbers of nucleotides between the putative transcription start site and the polyadenylation sites are 1872 and 3175, respectively. When the addition of a poly(A) tail is taken into account, these figures are in agreement with the 2.1- and 3.4-kilobase AdoMetDC mRNA species found in mouse tissues and cell lines(17) . Therefore, utilization of both polyadenylation signals in the intronless gene will contribute to the formation of transcripts that are likely to be indistinguishable from the transcripts derived from the bona fide AdoMetDC gene.
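The agreement between the predicted transcript lengths and the observed mRNA sizes can be made explicit. The Python sketch below first scans a sequence for the two signal motifs and then computes the poly(A) tail length implied by each mRNA species. The scanned sequence is a made-up toy string, not the cSAMm1 sequence, and the implied tail lengths are only approximate because mRNA sizes estimated from gels are themselves approximate.

```python
# Sketch: locate polyadenylation signals and estimate implied poly(A) tail lengths.
# The toy sequence is HYPOTHETICAL; distances and mRNA sizes are from the text.

SIGNALS = ("AATTAAA", "AATAAA")


def find_signals(seq: str, motifs=SIGNALS) -> list:
    """Return sorted (1-based position, motif) pairs for every motif occurrence."""
    hits = []
    seq = seq.upper()
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start + 1, motif))
            start = seq.find(motif, start + 1)
    return sorted(hits)


toy = "CCGG" + "AATTAAA" + "GCGCGC" + "AATAAA" + "CC"
print(find_signals(toy))

# Distance from the putative start site to each polyadenylation site (nt)
site_distance = (1872, 3175)
# Observed mRNA species, 2.1 and 3.4 kilobases, expressed in nt
mrna_size = (2100, 3400)

implied_tail = [size - dist for size, dist in zip(mrna_size, site_distance)]
print(implied_tail)
```

Both species imply a tail of a little over 200 nt, which is why the two figures can be reconciled with the observed transcript sizes.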
Figure 2:
Comparison of the human (A) (27) and rat (B) (25, 26) AdoMetDC
genes with the intronless AdoMetDC gene of the mouse (C). The
exons (E1-E9) and the corresponding regions of the
intronless gene are depicted by boxes, with the open boxes referring to the protein-coding region, and the closed boxes referring to the 5`- and 3`-UTRs. The
boundaries of the intronless gene are defined by a 13-bp direct repeat
(AAGAAACATTCTA). All three genes contain two major
polyadenylation/termination sequences in their 3`-UTR, the one most
upstream being AATTAAA and the other
AATAAA(25, 26, 27) . The 5`-flanking regions
of the human and rat genes possess TATA boxes, and in the corresponding
region of the intronless mouse gene there is a TATA-like box (TATTAAT)
at -28 (the number refers to the first nucleotide in the
sequence). The mouse AdoMetDC gene shares with long interspersed
elements (46) their four canonical structural features: (a) it does not contain introns, (b) it represents a
full-length copy of the processed transcript from the functional gene, (c) it contains a poly(dA) tract (dA) at the 3`
end, and (d) it is flanked by target site duplications, i.e. the cellular DNAs adjacent to the retrotransposed
structure display direct repeats.
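Two of the structural hallmarks listed above, flanking direct repeats and a 3` poly(dA) tract, lend themselves to a simple sequence check. The Python sketch below looks for two copies of a given repeat and measures the run of A residues immediately upstream of the second copy. The 13-bp repeat is the one given in the legend; the toy locus, the function name, and the 10-nt threshold are illustrative assumptions and do not reproduce the cloned sequence or the method used in this study.

```python
import re

# Sketch: check a locus for flanking direct repeats and a 3` poly(dA) tract.
# TSD is from the legend; the toy locus and threshold are HYPOTHETICAL.

TSD = "AAGAAACATTCTA"  # 13-bp direct repeat


def retroposon_hallmarks(locus: str, repeat: str, min_poly_a: int = 10):
    """Return a summary dict, or None if the repeat does not occur twice."""
    first = locus.find(repeat)
    if first == -1:
        return None
    second = locus.find(repeat, first + len(repeat))
    if second == -1:
        return None
    # Sequence enclosed by the two repeat copies
    insert = locus[first + len(repeat):second]
    # Run of A residues at the very end of the enclosed sequence
    match = re.search(r"A+$", insert)
    tail = len(match.group()) if match else 0
    return {
        "insert_length": len(insert),
        "poly_dA_length": tail,
        "has_poly_dA": tail >= min_poly_a,
    }


toy_locus = "GGCC" + TSD + "CTCGCTTGACGT" + "A" * 15 + TSD + "TTGG"
result = retroposon_hallmarks(toy_locus, TSD)
print(result)
```

For the toy locus the function reports a 27-nt enclosed sequence ending in a 15-nt poly(dA) tract; a locus lacking the second repeat copy returns `None`.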
The bona fide mouse AdoMetDC gene(s) has not yet been isolated and sequenced, but the corresponding rat genes are known to be interrupted by seven introns(26) , and the human AdoMetDC gene has one additional intron (27) (Fig. 2). Exon-intron junctions in these genes are in identical positions except that intron 6 of the human gene is missing in the rat gene (Fig. 2, A and B)(27) .
In the rat genome, two distinct but closely related AdoMetDC loci have been found, both located on chromosome 20(26) . Despite some differences between their exon sequences, the genes code for identical proteins. The 5`-flanking regions upstream of nt -63 are totally different. Both promoters appear to be efficient but controlled by different sets of transcription factors. In view of their structures and chromosomal location, it has been suggested that these AdoMetDC genes have arisen from a recent duplication event in the rat genome(26) . In the mouse genome, a functional duplication of an AdoMetDC gene instead seems to have arisen through retroposon recruitment. The fact that the intronless gene has acquired a unique 5`-flanking region and has lost introns that, in the corresponding rat gene, contain potential promoter and enhancer elements(26) , suggests that the transcriptional regulation of the intronless AdoMetDC gene is completely different from that of the bona fide gene. These changes may also imply that the transcription of the intronless AdoMetDC gene is not subject to the same feedback regulation by polyamines (1, 16) as the bona fide gene.
AdoMetDC gene sequences have been mapped to human chromosomes 6 and X with the use of a panel of human-mouse somatic cell hybrids(49) . In agreement, Maric et al. (27) found AdoMetDC gene-related sequences in DNA libraries specific for human chromosomes 6 and X. They found the gene on chromosome 6 to be active. Since partial nucleotide sequencing revealed that the gene on the X chromosome lacked introns, it was suggested that this locus represents a processed AdoMetDC pseudogene(27) . Whether this human AdoMetDC-related gene sequence also represents a functional intronless gene or is merely a nonfunctional pseudogene remains to be determined.
Figure 5:
Analysis of the autocatalytic processing
of the protein encoded by the intronless mouse AdoMetDC gene. Human and
mouse AdoMetDC proteins were pulse-labeled with
[³⁵S]methionine for 5 min while overexpressed in E. coli strain HT551(37). At 0, 5, 10, 15, 20, 30, 45, and 60
min after radiolabeling, extracts were immunoprecipitated with a
recombinant human AdoMetDC antibody, and the precipitates were analyzed
by SDS-PAGE. Radiolabeled proteins were visualized by fluorography.
Migration of [¹⁴C]methylated protein standards of
the indicated molecular masses (kDa) is shown on the left. The
protein encoded by the intronless mouse AdoMetDC gene is equivalent to
the 38-kDa proenzyme of the human AdoMetDC and is shown to be processed
to a 31-kDa α subunit (and an 8-kDa β
subunit).
The purified protein
migrated in accordance with the 31-kDa subunit of human AdoMetDC (Fig. 6) and exhibited catalytic activity characteristic of
AdoMetDC (see the legend to Fig. 6) despite the two amino acid
substitutions. The specific activity of the purified mouse enzyme was
236 units/mg of protein, which is comparable with that of the human
enzyme (530 units/mg of protein) purified according to the same
procedure. One unit of enzyme activity is defined as releasing 1 nmol
of CO₂/min. The essentially normal behavior of the protein
encoded by the intronless AdoMetDC gene is consistent with the fact
that the two amino acid substitutions (methionine 70 to isoleucine and
alanine 139 to valine) do not involve amino acids known to be essential
for AdoMetDC activity or proenzyme processing, as determined by
site-directed mutagenesis(1) , nor do they change the net
charge of the protein.
Figure 6:
Electrophoretic mobility of the protein
encoded by the intronless mouse AdoMetDC gene. Human (B) and
mouse (C) AdoMetDC proteins, expressed in E. coli strain HT551(37), were purified on a
methylglyoxal-bis(guanylhydrazone)-Sepharose affinity column, separated
by SDS-PAGE (12.5% gel) and stained with Coomassie Brilliant Blue.
Migration of protein standards of the indicated molecular masses (kDa)
is shown in A. The protein encoded by the intronless mouse
AdoMetDC gene was purified and migrated in accordance with the 31-kDa
subunit of the human AdoMetDC and its specific activity (236 units/mg
of protein) was comparable with that of the human enzyme (530 units/mg
of protein) purified according to the same procedure. One unit of
enzyme activity is defined as releasing 1 nmol of
CO₂/min.
Except for the two amino acid substitutions mentioned above, the primary structure of the protein encoded by the intronless AdoMetDC gene is identical to the human and rat AdoMetDC proteins (Figs. 1 and 2). Thus, structurally important domains of the AdoMetDC protein are unaffected by the mutations, including the only conserved region between eukaryotic and prokaryotic AdoMetDCs (1, 23) (amino acids 81-91 of the intronless gene), and the PEST region (50) (amino acids 243-269 of the intronless gene), which may be important for the rapid turnover of the enzyme.
It is interesting to note that multiple forms of
AdoMetDC have been observed in the rat(51) . Whether a gene
corresponding to the intronless AdoMetDC gene of the mouse, which is
strongly expressed in the liver, can account for the alternate form
observed in the rat liver remains to be determined. Regardless,
mouse tissues may have two types of AdoMetDC homodimers, those
containing two α subunits encoded by the intronless gene and those
containing two α
subunits encoded by the bona fide gene, as well as
a heterodimer containing one α
subunit of each.
Assuming that primordial genes evolved with introns(75) , the lack of introns in genes of higher eukaryotes can be due to (a) intron loss or (b) reinsertion into the genome of genetic material copied from mRNA by reverse transcription. The existence of an intron in the 5`-UTR of the human leukosialin gene, a gene that is devoid of introns in its coding region(63) , argues for intron elimination. On the other hand, our finding that the cloned mouse AdoMetDC gene is intronless over its entire length argues for a mechanism involving reverse transcription.
What is the biological significance of the multiplicity of AdoMetDC genes? Is it simply an evolutionary accident that brings no selective advantage to the organism, or have the distinct genes evolved to exercise specific functions? Probably, accidental gene amplification and fixation of gene families can occur as an essentially neutral event. Our finding, that the intronless AdoMetDC gene in mice exhibits a quantitatively distinct pattern of expression, suggests that it may have acquired a novel role in cell stimulatory activities. The fact that the AdoMetDC genes are not co-ordinately turned on in response to induction but that different genes are turned on to varying extents in different tissues and cells suggests that the individual AdoMetDC species may have further, as yet unrecognized, activities of importance in physiological growth control and differentiation.
Although the protein encoded by the intronless AdoMetDC gene is functional, we cannot exclude the possibility that mutations at the sites observed could lead to inadequate expression of the gene by impairing transcription, translation, and/or post-translational events. Because the formation of mRNA from the intronless AdoMetDC gene does not require splicing, less time may elapse between transcription of the gene and appearance of the mature message in the cytoplasm and its translation into functional AdoMetDC protein. Thus, a signal that leads to increased transcription of this gene may be more rapidly translated into increased levels of AdoMetDC protein than a signal stimulating the bona fide gene, which would produce a nonprocessed message that would have to be spliced.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) Z23077.