(Received for publication, September 19, 1996, and in revised form, October 25, 1996)
From the Departments of Molecular Pathology, Molecular Biology, and Immunopathology, Tokyo Metropolitan Institute of Gerontology, 35-2 Sakaecho, Itabashi-ku, Tokyo 173, Japan
Rat nucleoside diphosphate (NDP) kinase is
composed of two isoforms (α and β) encoded by independent genes.
The mRNAs are expressed ubiquitously; however, the level of
expression is tissue-dependent and is also up- or
down-regulated under certain conditions, including growth stimulation,
differentiation, and tumor metastasis. To address the regulatory
mechanisms of gene expression for the rat NDP kinase major isoform α
(an nm23-H2/PuF homologue), we identified the transcription initiation
sites in detail by RNase protection and 5′-rapid amplification of cDNA
ends and located the core promoter region by chloramphenicol
acetyltransferase assay. The transcripts, initiated from an
extraordinarily wide range of sites, were categorized into two groups;
one transcribed from an upstream region was spliced in the untranslated
region (group 1), whereas the other initiated in the downstream region
was not (group 2). RNase protection demonstrated that the group 1 mRNA was the dominant form present in all tissues except heart and
skeletal muscle. In situ hybridization revealed cell-specific expression of these mRNA species. Furthermore, they differed in translational efficiency (group 2 α > β > group 1 α). These findings suggest that the regulation of the NDP
kinase expression at both transcriptional and posttranscriptional steps
could be fundamentally governed by the selection of transcription initiation sites.
Nucleoside diphosphate (NDP) kinase
(EC 2.7.4.6) plays a pivotal role in maintaining intracellular levels
of triphosphate nucleotides at the expense of ATP. It had been thought
to be a typical housekeeping enzyme (1), but recent studies have
revealed that the NDP kinase protein may have multiple regulatory
functions besides the phosphotransferase activity and that it behaves as a tumor metastasis suppressor (nm23; Refs. 2 and 3, reviewed in Ref.
4), a differentiation-inhibiting factor (5), and a transcription factor
for human c-myc (PuF; Ref. 6). These functions are
reportedly unrelated to NDP kinase enzyme activity per se
(7-9). NDP kinase forms a gene family in various organisms. In
mammals, two isoforms of NDP kinase have been reported from humans,
mice, and rats (10-13). The rat isoforms, α and β, are encoded by
distinct genes arranged in a tandem array (13). The two NDP kinase
isoforms display only small functional differences in their activity as
phosphotransferases to the extent examined (14). However, recent
studies have demonstrated that each of the two isoforms (α and β)
may possess its own specific functions in addition to their common
enzymatic properties: they are distinguished on the basis of one being
a homologue to the transcription factor for the human c-myc
gene (PuF/nm23-H2) and the other the candidate tumor metastasis
suppressor (nm23-H1).
NDP kinase expression is reportedly increased at specific stages of life of several organisms and in various types of cells under certain circumstances; for example, during formation of the imaginal discs at the larval stage of Drosophila melanogaster (15), in murine systemic organs at their organogenesis stages (16), in concanavalin A-stimulated human T lymphocytes (17), and in human diploid fibroblasts when immortalized by SV40 transformation and 60Co irradiation (18). Decreased expression of NDP kinase has been reported also in the slime mold Dictyostelium discoideum when aggregation and development of the multicellular organization was triggered by starvation (19) and in the case of tumor metastases in experimental animal systems and in certain human tumors (3, 4).
Beginning at the very early stage of the cloning and characterization of the rat NDP kinase mRNA, it was noticed that the quantities of mRNA are not directly related to those of the protein or to enzymatic activity: the NDP kinase mRNA levels vary more than 10-fold among different organs tested, whereas levels of the protein or enzymatic activity vary at most by 2-fold (20). Similar enigmatic phenomena were also observed with cell immortalization: the mRNA was increased severalfold, whereas the protein amount only increased at most 1.5-fold (18). These observations may suggest possible posttranscriptional regulations of the NDP kinase genes. Although several factors (or motifs) in mRNA have been reported as regulatory elements responsible for the posttranscriptional regulation of gene expression (reviewed in Refs. 21 and 22), such structural analyses of NDP kinase mRNA are totally lacking.
Previously we reported the genomic structure and transcription start sites for the major isoform (α isoform) of rat NDP kinase and assigned four exons (23). Since then we have found another type of cDNA clone coding for the α isoform, which is composed of five exons, including an additional 5′-untranslated exon, and is spliced differently. We have termed this the long form (now renamed the group 1 type) mRNA, and the previously reported type of mRNA the short form (now renamed the group 2 type) (13). These observations have raised questions as to the relative abundance of the two types of α isoform transcripts, their exact transcription initiation sites, and the regulatory mechanism of their expression.
To address these questions we sought to identify the transcription start sites for rat NDP kinase α isoform mRNA in detail by RNase protection in combination with the 5′-RACE method, and have identified numerous initiation sites. Furthermore, RNase protection assay and in situ hybridization analyses have revealed the tissue- and cell-specific expression of these mRNA species. These data, together with CAT assay data, which show a core promoter region for the α gene (Fig. 1), may provide a perspective on the complicated transcriptional regulatory mechanisms of the NDP kinase genes. A possible mechanism of posttranscriptional regulation of NDP kinase expression is discussed based on the finding that the rates at which the transcripts are translated under in vitro conditions differ among the heterogeneous forms.
RNA Isolation
Systemic organs were harvested from Wistar rats aged between 12 and 16 weeks. Total RNA was isolated by the acid-phenol extraction method (24) with modifications. Briefly, samples were extracted twice in a premixed phenol and guanidinium isothiocyanate solution (ISOGEN; Nippon Gene Inc., Tokyo, Japan) and chloroform. Poly(A)+ RNA was further purified using an oligo(dT)-cellulose spin column (Pharmacia Biotech Inc.).
Ribonuclease Protection Assay
RNase protection assays were performed as described previously with modifications (23). Briefly, four RNA probes were synthesized. To identify transcription initiation sites we made two probes, α-l and α-s, which correspond to the rat NDP kinase α gene from −209 to −453 and from −9 to −204, respectively (see Fig. 2). To quantify the two subtypes of the α mRNA, group 1 and group 2, two probes were generated from two RACE clones: α-1 was generated from clone 35A and is complementary to the 39 nucleotides of the 3′-end of the first exon (−244 to −206) and 46 nucleotides of the 5′-end of the second exon (−4 to +42), and α-2 was generated from a group 2 type clone and corresponds to a fragment 70 nucleotides in length (−28 to +42) (see Fig. 4). The numbering system used in this study to identify nucleotide locations starts with the translation initiation site (23). Full-length synthetic RNAs that represented group 1 and group 2 types of α and β mRNA were used as control samples. The processed materials were electrophoresed in denaturing gels supplemented with 40% formamide.
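The probe sizes quoted above can be checked against the stated coordinates. The following is a minimal sketch, assuming that position +1 is the first nucleotide of the initiation codon and that the numbering has no position 0 (an assumption, although the quoted fragment lengths are consistent with it). The function name `span_length` is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Inclusive length of a span in translation-start-relative coordinates.

    Assumption: -1 is immediately followed by +1 (there is no position 0).
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    lo, hi = sorted((start, end))
    length = hi - lo + 1
    if lo < 0 < hi:
        length -= 1  # span crosses the -1/+1 boundary; skip the missing 0
    return length


# Probe for group 1/group 2 quantification: exon 1 part plus exon 2 part.
exon1_part = span_length(-244, -206)
exon2_part = span_length(-4, 42)
probe_1_total = exon1_part + exon2_part

# Probe derived from a group 2 type clone.
probe_2_total = span_length(-28, 42)

print(exon1_part, exon2_part, probe_1_total, probe_2_total)
```

Under this convention the spans come out at 39, 46, 85, and 70 nucleotides, matching the lengths given in the text.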
5′-RACE Method
To ensure isoform-specific cDNA synthesis, we used primer 4S (5′-CCGAAGGAACTTCATGG-3′, +126 to +110), corresponding to the 3′-terminal stretch of the second exon of the α gene, where the two homologous (α and β) genes show the most divergent sequences between them. To obtain full-length cDNA with high efficiency, we applied a two-step reverse transcription reaction. Briefly, 2.5 µg
of poly(A)+ RNA and 2 mM primer in water were
heated at 85 °C for 10 min and then annealed at 65 °C for 15 min,
followed by the addition of reverse transcription buffer and reverse
transcriptase (SuperScript II; Life Technologies, Inc.) and
ribonuclease inhibitor (RNasin, Promega), and then incubated at
50 °C for 45 min. After heat denaturation at 85 °C for 10 min, a
second extension reaction was performed by further addition of fresh
reverse transcriptase. Next, excess amounts of RNA and the
single-stranded portion of RNA were eliminated by ribonuclease A and T1
(Ambion) digestion, and the resulting degraded RNA and excess amounts
of primer were removed by an S-300 MicroSpin column (Pharmacia), which
reportedly adsorbs single-stranded RNA and synthetic oligonucleotides
(25). The samples enriched with heteroduplex forms of RNA and cDNA
were treated with ribonuclease H (Life Technologies), purified by a
Sephadex G-50 spin column (Boehringer Mannheim) and then ethanol
precipitation. To make templates for PCR, we adopted the single strand
ligation to single-stranded cDNA method (26) using a 5′-AmpliFINDER
RACE kit (Clontech) following the manufacturer's instructions. The RACE
methods (27) were carried out with Taq and/or Pfu
(Stratagene) polymerases. Representative conditions were as follows.
The reaction mixture contained 2 mM each of an anchor-specific primer (5′-CCTCTGAAGGTTCCAGAATCGATAG-3′) and gene-specific primer 2 (5′-ATCTGGCTTGATGGCAATGAAGGTAC-3′, +42 to +17) or 6L (5′-AGAAGCAAGAAGTGTAGTCGATG-3′, −9 to −31), 10% Me₂SO, 0.2 mM dNTP, and 2.5 units of Pfu polymerase in a low magnesium buffer (Idaho Technology).
Ten µl of the mixture in a glass tube was incubated at 97 °C for
60 s, 52 °C for 10 s, 75 °C for 120 s for 40 cycles, and additionally at 75 °C for 10 min in an Idaho Technology
1605 air thermocycler. A 5-µl aliquot of each reaction was applied to
a 2.5% agarose gel (see Fig. 3, A and B). The
residual 1-µl aliquot was used as a template for a second PCR in the
same reaction mixture described above but containing Taq
instead of Pfu polymerase. The second reaction was carried
out at 95 °C for 60 s, 52 °C for 10 s, 72 °C for
120 s for 10 cycles, and additionally at 72 °C for 10 min. The
PCR products were subjected to direct subcloning using a TA cloning kit
(Invitrogen). The established clones were sequenced using a Sequenase
sequencing kit (U. S. Biochemical Corp.).
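As a quick consistency check on the gene-specific primers listed above, the sketch below compares each primer's length with the coordinate range quoted for it and derives the sense-strand sequence it anneals to. The sequences and coordinates are taken from the text; the helper names are hypothetical, and the span calculation assumes both ends of a primer lie on the same side of the translation start (true for all three primers).

```python
# Antisense gene-specific primers: name -> (sequence 5'->3', 5' coord, 3' coord)
PRIMERS = {
    "4S": ("CCGAAGGAACTTCATGG", 126, 110),
    "2": ("ATCTGGCTTGATGGCAATGAAGGTAC", 42, 17),
    "6L": ("AGAAGCAAGAAGTGTAGTCGATG", -9, -31),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def same_side_span(a: int, b: int) -> int:
    """Inclusive span length when both coordinates have the same sign."""
    if (a > 0) != (b > 0):
        raise ValueError("coordinates straddle the translation start")
    return abs(a - b) + 1


def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


report = {}
for name, (seq, five_prime, three_prime) in PRIMERS.items():
    report[name] = {
        "length": len(seq),
        "span": same_side_span(five_prime, three_prime),
        "sense_target": reverse_complement(seq),
        "gc": round(gc_fraction(seq), 2),
    }

for name, row in report.items():
    print(name, row)
```

For all three primers the sequence length equals the quoted coordinate span (17, 26, and 23 nucleotides).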
Plasmid Constructions
Full-length cDNAs. The full-length cDNA clones were constructed with the 5′-RACE products described above and cDNA clones for the α isoform previously reported (20). We used two RACE clones, 50B and 28H, as representatives of group 1 type α and group 2 type α, respectively. They were cut out from the vectors by EcoRI and then digested with XhoI. The cap site to XhoI site fragments from the RACE products and the XhoI site to poly(A) fragments from cDNA were ligated. The reconstructed fragments were subcloned into pBluescript (Stratagene) and/or pGEM (Promega) plasmids. A full-length cDNA for the β isoform was also constructed and used as a control (13; data not shown).
Several expression plasmids were constructed in the CAT gene-containing plasmid pKK232-8 (Pharmacia). Six sense primers, 239 (5′-GGGTACCCCAGAGCAGAGAGT-3′), 238 (5′-TCCCAGGGAAAGGTGAATGCAGATG-3′), 237 (5′-TCCTGCCTCACAGCCCTCCGT-3′), 227 (5′-TCGCTCTCCGCTGGCACCAGCC-3′), 236 (5′-CTCAGG( )TCCCGCGGTCTCCTTTC-3′), and 228 (5′-AAA( )ATCTGGAAAGCCACGTGTGTCCT-3′), and one antisense primer, 235 (5′-TGCAGAAGCAAGAAGTGTAGTCGA-3′), were used to amplify seven overlapping genomic fragments of the 5′-regulatory region for the α isoform. To subclone the PCR fragments into the BamHI-HindIII cloning site of the CAT vector, underlined sequences were added to or altered from gene-specific fragments to generate appropriate restriction enzyme sites denoted by boldface letters. Standard PCRs were performed to amplify the gene fragments, and then the products were digested with BamHI, Sau3AI, or BglII and with HindIII and then subcloned into pKK232-8.
The constructs used in this study were made as follows (construct name, primer pair/restriction enzymes/genomic region): 1) prNDPKα(−116)CAT, 228 × 235/BglII and HindIII/−116 to −7; 2) prNDPKα(−158)CAT, 236 × 235/BamHI and HindIII/−158 to −7; 3) prNDPKα(−243)CAT, 227 × 235/Sau3AI and HindIII/−243 to −7; 4) prNDPKα(−366)CAT, 237 × 235/Sau3AI and HindIII/−366 to −7; 5) prNDPKα(−452)CAT, 237 × 235/BamHI and HindIII/−452 to −7; 6) prNDPKα(−568)CAT, 238 × 235/BamHI and HindIII/−566 to −7; and 7) prNDPKα(−606)CAT, 239 × 235/BamHI and HindIII/−606 to −7.
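The deletion series can be summarized as a small table. The sketch below records each construct as listed in the text and computes the length of the genomic region it carries. The class and field names are hypothetical, and the lengths refer only to the quoted genomic coordinates, not to any restriction-site sequence added by the primers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatConstruct:
    number: int        # construct number used in the text
    sense_primer: int  # sense primer paired with antisense primer 235
    enzymes: str       # enzymes used for subcloning
    start: int         # 5' boundary relative to the translation start
    end: int           # 3' boundary relative to the translation start


CONSTRUCTS = [
    CatConstruct(1, 228, "BglII + HindIII", -116, -7),
    CatConstruct(2, 236, "BamHI + HindIII", -158, -7),
    CatConstruct(3, 227, "Sau3AI + HindIII", -243, -7),
    CatConstruct(4, 237, "Sau3AI + HindIII", -366, -7),
    CatConstruct(5, 237, "BamHI + HindIII", -452, -7),
    # The construct name says 568; the listed region starts at -566.
    CatConstruct(6, 238, "BamHI + HindIII", -566, -7),
    CatConstruct(7, 239, "BamHI + HindIII", -606, -7),
]


def insert_length(construct: CatConstruct) -> int:
    """Inclusive length of the genomic region (both ends are upstream)."""
    return construct.end - construct.start + 1


lengths = {c.number: insert_length(c) for c in CONSTRUCTS}

for c in CONSTRUCTS:
    print(f"construct {c.number}: {c.start} to {c.end}, {lengths[c.number]} bp")
```

All seven fragments share the same 3′ boundary (−7), so the series is a nested 5′ deletion set in which each longer construct contains every shorter one.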
In Situ Hybridization
Rat organ specimens embedded in Optimum Cutting Temperature compound (Miles Laboratories) were frozen in liquid nitrogen. In situ hybridization of frozen sections (8 µm) was performed following the procedure described elsewhere (28). We used three kinds of riboprobes labeled with digoxigenin-11-UTP (Boehringer Mannheim). They were complementary to: 1) the 5′-untranslated region of the group 1 type α mRNA (at position −206 to −384); 2) the 5′-untranslated region of the group 2 type α mRNA (at position −4 to −85); and 3) the 5′-untranslated region of the β mRNA. The signals were detected using a digoxigenin detection kit (Boehringer Mannheim).
Cell Lines, Transfection, and CAT Assay
Rat fibroblast cell line 3Y1 and an osteosarcoma cell line, UMR106 (29), were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum. Transfections were performed by the lipofection method using N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium methylsulfate (Boehringer Mannheim), following the manufacturer's instructions. Briefly, cells at 50-60% confluence in a 10-cm dish were transfected with a total of 5 µg of DNA and incubated for 12 h, and then the medium was replaced with fresh medium and the cells were incubated for a further 48 h. Cells were harvested, the cell lysates were adjusted to equivalent protein concentrations, and CAT activity was measured following a standard method (30). The Rous sarcoma virus-CAT plasmid and purified CAT enzyme were used as positive controls. The conversion rate was calculated by using Quantityone software (PDI Imageware Systems, Huntington Station, NY).
In Vitro Translation Assay
Synthetic RNAs were generated from template plasmids containing each of the full-length cDNAs described above. These synthetic RNAs were capped by using the mRNA capping kit (mCAP™; Stratagene), and then 1 µg of each synthetic RNA was subjected to an in vitro translation assay using the ECL in vitro translation system (Amersham Corp.). After SDS-polyacrylamide gel electrophoresis, aliquots of the biotinylated translation products were electrophoretically blotted onto a polyvinylidene difluoride membrane (Bio-Rad) and detected using streptavidin conjugated with horseradish peroxidase and an ECL reagent (Amersham). Densitometric quantification was carried out by using PDQuest software (PDI).
Reagents and Chemicals
Unless otherwise indicated, restriction enzymes and modified enzymes used in this study were purchased from Toyobo (Osaka, Japan).
To determine the transcription initiation sites of the two types of α isoform mRNA, we performed RNase protection assays using two kinds of synthetic RNA probes, α-l and α-s. When we used α-l, which is complementary to the first exon and a further upstream region of the rat NDP kinase α gene from −209 to −453, extraordinarily numerous protected bands were observed between positions 185 and 30, judging from the DNA sequencing ladder (Fig. 2A). The major protected bands migrated between positions 185 and 150, 100 and 85, and 60 and 30. On the other hand, the α-s probe, which is complementary to the first intron of the gene from −9 to −204, again provided multiple protected bands between positions 100 and 40 (Fig. 2B). Interestingly, the protected band patterns of the α-l probe with total RNA from the liver and heart were essentially identical, although some subtle differences in signal intensity were observed (Fig. 2A, lanes 1 and 2). The probe α-s was also protected by the liver total RNA in the same way as observed for the heart RNA (Fig. 2B, lane 1), with a weaker intensity (data not shown). Under the reaction conditions used, the positive control had a single dominant band, and the negative control had no measurable background, leading to the conclusion that the reaction conditions were optimized. However, there were slight ambiguities in the data obtained: synthetic RNA samples used as positive controls provided protected fragments 6 bases longer than expected. Such differences could be ascribed in part to the fact that each full-length synthetic RNA possessed an artificially added linker sequence at the 5′-end of the genuine sequence. In addition, RNA reportedly has slightly slower mobility than a DNA molecule of similar size. Thus we tentatively speculate that the major transcription initiation sites are located between positions −50 and −105, −240 and −270, −295 and −315, and −365 and −390. To confirm the data, we used other methods as described below.
To identify the transcription initiation sites of the rat NDP kinase α isoform, the 5′-RACE method was performed. When we compared the amplification efficiency of Pfu with that of the most conventional thermostable polymerase, Taq, under the same conditions, Pfu generated longer products with higher efficiency than Taq (Fig. 3A). Therefore, we performed PCR using Pfu under relatively high stringency conditions thereafter. Representative crude PCR products obtained with primer 2 and anchor primers are shown in Fig. 3B. Considering the anchor primer size (25-mer), the PCR with primer 6L and the anchor primer generated a broad band ranging from 50 to 130 bp and a band of around 200 bp in net length, and PCR with primer 2 and the anchor primer produced a broad band from around 100 to 250 bp. These profiles of the gross PCR products generated with Pfu polymerase agreed well with the gross pattern of the transcription start sites predicted by the RNase protection assays. A number of the PCR products generated with primer 2 were subcloned and verified by sequencing. All of the sequenced clones were found to code for genuine NDP kinase gene segments without exception. Interestingly, most of the clones appeared to have an additional guanine nucleotide at their 5′-ends. Representative clones are shown in Fig. 3, C and D. Recently, Hirzmann et al. (31) reported that reverse transcriptase can occasionally read the 5′-cap structure as G during cDNA synthesis, resulting in the addition of a G residue to about half of their RACE clones. In our study, 29 of 34 (85%) informative clones possessed a G residue at the 5′-end. Furthermore, 21 of the 29 clones had an unpaired G residue, suggesting that these clones were generated from full-length RNA templates with the cap-G structure (summarized in Fig. 1). The higher incidence of reverse transcription of the cap-G residue in our experiments reflects not only the high integrity of our template mRNA but also the reliability of the techniques we used. The locations of the 5′-termini of the NDP kinase α mRNA identified by the RACE method agreed with most of the putative major transcription initiation sites suggested by the RNase protection assay within a couple of nucleotides (e.g. −50, −58, −78, −85, −250, −252, −257, −290, −299, −300, −366, −368, −380, and −384).
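The clone tallies quoted above reduce to simple proportions. The sketch below only restates the counts given in the text (34 informative clones, 29 with a 5′-terminal G, 21 of those with an unpaired G); the variable names are hypothetical.

```python
informative_clones = 34   # informative RACE clones sequenced
with_5prime_g = 29        # clones carrying a G residue at the 5'-end
with_unpaired_g = 21      # of those, clones whose G is unpaired


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


g_rate = percent(with_5prime_g, informative_clones)
unpaired_among_g = percent(with_unpaired_g, with_5prime_g)
unpaired_overall = percent(with_unpaired_g, informative_clones)

print(f"5'-G clones: {g_rate}% of informative clones")
print(f"unpaired G: {unpaired_among_g}% of 5'-G clones")
print(f"unpaired G: {unpaired_overall}% of all informative clones")
```

The first figure reproduces the 85% stated in the text; the other two are derived from the same counts and are not reported by the authors.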
Accumulated data from the RACE clones further verified the existence of two different types of transcripts for the NDP kinase α isoform. All the clones that started upstream of position −241 (total, 12 clones), except one (clone 20H), were spliced at −206 (donor) and −4 (acceptor) and were categorized as the group 1 type mRNA. In contrast, all clones initiated at any point downstream from position −206 were not spliced and bore various sizes of the 5′-untranslated stretch continuing from the cap sites to the identical translation initiation site at position +1. These clones were categorized as the group 2 type. The splicing donor and acceptor sites not only obey the Chambon (GT-AG) rule, but the adjacent sequences also fit the conserved consensus sequences (32). The exceptional clone 20H could have been generated from a nascent unprocessed mRNA template.
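The grouping rule described above can be written as a small classifier. This is a sketch under stated assumptions: the function names are hypothetical, the first exon is taken to end at −206 and the second exon to begin at −4 (as in the probe description), start sites between −241 and −206 are left unclassified because the text reports no rule for them, and the unspliced exception (clone 20H) is not modeled.

```python
DONOR_END = -206      # last nucleotide of the first exon (splice donor side)
ACCEPTOR_START = -4   # first nucleotide of the second exon (acceptor side)


def classify_start_site(start: int) -> str:
    """Assign a transcript to group 1 or group 2 from its start position."""
    if start >= 0:
        raise ValueError("start sites are upstream of the translation start")
    if start < -241:
        return "group 1"       # upstream start, spliced 5'-untranslated region
    if start > DONOR_END:
        return "group 2"       # downstream start, unspliced
    return "unclassified"      # no rule is given for -241 to -206


def utr_length(start: int) -> int:
    """Length of the 5'-untranslated region implied by the start position."""
    group = classify_start_site(start)
    if group == "group 1":
        exon1_part = DONOR_END - start + 1   # start site through -206
        exon2_part = -ACCEPTOR_START         # -4 through -1
        return exon1_part + exon2_part
    if group == "group 2":
        return -start                        # start site through -1
    raise ValueError("no splicing rule for this start position")


for site in (-384, -250, -85, -50):
    print(site, classify_start_site(site), utr_length(site))
```

One consequence visible in the sketch is that a group 1 transcript starting at −250 carries a shorter 5′-untranslated region (49 nucleotides) than a group 2 transcript starting at −85 (85 nucleotides), because splicing removes the region between −205 and −5.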
In the course of this study we could not find any typical TATA boxes but found heavily GC-rich stretches alongside of the expected binding sites of the basic transcriptional machinery.
The Group 1 and 2 Types of
Considering the huge heterogeneity of the transcription initiation sites for both the group 1 and group 2 type α mRNA, it would be very difficult to quantify each transcript individually. Therefore, we tried to evaluate the two groups of α isoform transcripts by RNase protection assay using relatively short RNA probes, which were expected to cover most of the transcripts at their common trunks. The RNA probes α-1 and α-2, schematized in Fig. 4, were accordingly used. When the probe α-1 is used, the group 1 and group 2 type α mRNA are expected to produce 85-nucleotide (Fr 1) and 46-nucleotide (Fr 2) protected fragments, respectively. Synthetic standard RNA samples representing the group 1 and group 2 α mRNA actually produced such protected fragments, although the protected sizes were somewhat longer than expected, owing to complementarity of the adjacent vector sequences to the probe. Total RNA samples extracted from adult rat tissues, including cerebrum, spleen, heart, lung, liver, kidney, small intestine, testis, and skeletal muscle, and from two cell lines, UMR106 and 3Y1, were examined and found to produce two to three major protected bands that were concordant with the signals corresponding to Fr 1 and Fr 2 of the standard samples. The Fr 1 signals corresponding to the group 1 type α mRNA were constantly expressed in these samples except those from the cell lines, in which the signal was markedly increased. Of interest was the finding that a decreased amount of the group 1 α mRNA, the dominant form, was observed in a highly metastatic rat mammary adenocarcinoma cell line (MTLn3) compared with a low metastatic sib line (MTC) (data not shown; see Ref. 33). Although the signal corresponding to the group 2 α mRNA (Fr 2) was strong in the heart and skeletal muscle, it was weak in the liver and kidney and present in only trace amounts in other samples, including cultured cell lines (Fig. 4A). Similar experiments were performed using probe α-2 (Fig. 4B) and provided "mirror image" protected fragments compared with those obtained using probe α-1. Furthermore, this probe demonstrated that there are no remarkable initiation sites in the proximal region between positions −24 and −4. Thus, the previously reported predominant transcription start site (position −3 in Ref. 23) should be redefined as a splicing acceptor site (correctly assigned to position −4) of the second exon for the group 1 type α mRNA.
To analyze the expression of these transcripts at the cell level, we performed in situ hybridization using specific probes for group 1 type α, group 2 type α, and β. Generally, the group 1 type α was ubiquitously expressed, and the expression was more intense than for the group 2 type α, as observed by the RNase protection analyses. It is worth noting that both types of α mRNA coexisted in some kinds of cells, whereas there was no cell type that exhibited strong group 2 α mRNA expression alone. One representative case, the stomach of an adult rat, is shown in Fig. 5. A large amount of the group 1 type α mRNA was expressed in both gastric pit epithelial cells and fundic gland cells, whereas the group 2 type α mRNA was expressed strongly only in the chief cells in the fundic glands, the pattern being similar to that for the β mRNA.
Universally Active Core Promoter Region of the
To identify promoter activity of the α isoform gene, we made several CAT constructs (schematized in Fig. 6) and transfected them into UMR106 cells. The strongest CAT activity was observed in the cell lysates prepared from the cells transfected with CAT constructs 7 and 6. A reduced but measurable amount of activity was detected in the lysate of cells transfected with plasmid 5, and only a trace amount was detected in the lysate from cells transfected with construct 4. No significant CAT activity was detected in the lysates of cells transfected with plasmids containing genomic fragments shorter than that of construct 3. Essentially similar results were obtained when we used a fibroblast cell line, 3Y1, as a host (data not shown). The data indicate that one of the strongest core promoter activities seems to reside in the region between −568 and −452, where the most distal putative Sp1 binding consensus sequence, a GT box (an aberrant form of the GC box), is located (34). Weak but significant promoter activity may be present in the region between −452 and −366, in which another GC box is included. The region proximal to position −366 has no significant promoter activity in these cells. It should be noted that the region (between −568 and −452) that provided the strongest promoter activity corresponds to the most distal universally active transcription initiation sites. However, this region might be too far from the downstream initiation sites for the group 2 type α mRNA to operate. Thus, it is reasonable to postulate another promoter region for the downstream transcription initiation sites. Unfortunately, however, because of the trace amount of the group 2 type α transcripts in the cells used as hosts (Fig. 4), it would be very difficult to detect downstream promoter activities in these cells.
A Group 2 type
Although numerous different forms of transcripts for the α isoform have been defined, their physiological meaning is largely unknown. We have examined several possibilities, including structural polymorphism at the peptide level for both types of the α gene products. So far these attempts have been unsuccessful, except for the in vitro translation analyses. The in vitro translation analysis using synthetic RNAs generated from each representative of the group 1 and 2 types of α cDNA clones and the β cDNA clone revealed remarkable differences in translation efficiency: the group 2 type α RNA was translated most efficiently, the group 1 type α RNA least efficiently, and the β RNA at an intermediate level. Relative efficiencies were approximately 100, 14, and 36%, respectively (Fig. 7).
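Relative efficiencies of this kind are typically obtained by normalizing each densitometric reading to the strongest signal. The sketch below illustrates that normalization only; the raw readings are hypothetical values chosen so that the output reproduces the approximate percentages reported above, and they are not data from the study.

```python
# Hypothetical densitometric readings (arbitrary units), for illustration only.
raw_signal = {
    "group 2 type": 250.0,
    "group 1 type": 35.0,
    "other isoform": 90.0,
}


def relative_efficiency(signals: dict[str, float]) -> dict[str, int]:
    """Express each signal as a percentage of the strongest signal."""
    reference = max(signals.values())
    return {name: round(100 * value / reference) for name, value in signals.items()}


relative = relative_efficiency(raw_signal)
fold_difference = raw_signal["group 2 type"] / raw_signal["group 1 type"]

print(relative)
print(f"group 2 vs group 1: about {fold_difference:.1f}-fold")
```

With percentages of 100 and 14, the two transcript groups of the major isoform differ roughly 7-fold in translated product per microgram of capped RNA in this assay.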
The prevailing notion that NDP kinase is an essential enzyme for nucleotide metabolism in the cell has been proved in recent studies that have included the cloning of cDNA, in which transcripts coding for the enzyme from various organisms were found to be highly conserved. The compiled data promptly led to the realization that a single ancestral gene of NDP kinase could have been conserved from unicellular organisms to mammalian species. On the other hand, the discovery of nm23 (2) and its identification as an NDP kinase have opened a new era in NDP kinase research. As a consequence of the findings that the NDP kinase molecules may carry out multiple functions besides the enzymatic activity, as in the case of PuF (6, 9) and inhibiting factor (5, 8), it becomes apparent that NDP kinase may play a pivotal, multifunctional role in various organisms. Furthermore, based on compiled observations, it can be envisioned that the expression of NDP kinase is governed by binary regulatory mechanisms, that is, inducible and constitutive ones.
To get a bird's eye view of the NDP kinase genes in terms of evolutional divergence of their organization and function, we examined the exon-intron organization of the genes of various origins as a genomic evolution parameter. The intron numbers have increased as the species evolution has proceeded: 1) unicellular organisms such as Escherichia coli and Saccharomyces cerevisiae possess one undivided gene (35, 36); 2) D. melanogaster also has a single but separated gene that is divided by an intron (37, 38); 3) a slime mold D. discoideum bears two genes coding for cytosolic and mitochondrial type enzymes, both of which have two intervening sequences at identical locations with an additional two introns for the latter gene (39, 40); and 4) the mammalian species examined have two isoforms of NDP kinase. The two rat genes have four introns at completely conserved locations (13). The partial human gene structures for one isoform (nm23-H1) reported from a couple of laboratories are similar but not identical (41, 42); one shows exactly the same gene organization as the rat gene, whereas the other does not. Since the awd locus of the fruit fly is coding for a single NDP kinase gene that has one intronic sequence exactly corresponding to the location of the second intron of the rat genes, it is reasonable to suppose that the awd gene might represent an archetype of the mammalian NDP kinase gene, and duplication of the ancestral gene could have occurred after the arthropod and chordate ancestor's divergence. On the other hand, two NDP kinase genes of slime mold contain two intervening sequences, the locations of which are concordantly preserved between them but completely discordant with those of the mammalian species, making it likely that a single ancestral gene obtained species-specific intervening sequences after the phylum divergence from a common ancestor of the animal, then was duplicated and developed to the present forms.
Second, we compared CpG islands of the fly and rat NDP kinase genes to characterize the gross outline of the 5′-regulatory regions. The awd gene has a characteristic CpG island that continuously covers the 5′-flanking to the 5′-untranslated region (nine CpG dinucleotides in 115 bp) and the coding region (40 CpG dinucleotides in 459 bp). In the case of the rat α gene, CG dinucleotides were distributed from the 5′-regulatory region to the second exon (50 CpG dinucleotides in 561 bp, between −435 and +126). In contrast, the β gene has a sparsely distributed CpG cluster that covers the first exon and disappears in the 5′-portion of the first intron (data not shown). It follows, therefore, that the α gene may be representative of the direct descendant of the archetype NDP kinase gene.
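To compare the regions on a common scale, the CpG counts quoted above can be converted to densities. The sketch below uses only the counts and lengths given in the text; the function names and region labels are hypothetical, and `count_cpg` is included to show how such counts would be obtained from a sequence (the example string is made up).

```python
def count_cpg(sequence: str) -> int:
    """Count CG dinucleotides on the given strand."""
    return sequence.upper().count("CG")


def cpg_per_100bp(cpg_count: int, length_bp: int) -> float:
    """CpG dinucleotides per 100 bp, rounded to one decimal place."""
    return round(100 * cpg_count / length_bp, 1)


# (CpG count, region length in bp) as quoted in the text.
REGIONS = {
    "awd 5'-flanking to 5'-untranslated": (9, 115),
    "awd coding region": (40, 459),
    "rat major isoform, -435 to +126": (50, 561),
}

densities = {name: cpg_per_100bp(n, bp) for name, (n, bp) in REGIONS.items()}

for name, value in densities.items():
    print(f"{name}: {value} CpG per 100 bp")

# Made-up sequence, only to demonstrate the counting helper.
print(count_cpg("ACGCGTTCGA"))
```

The three regions come out at similar densities (roughly 8 to 9 CpG per 100 bp), which is the quantitative basis for describing the fly gene and the rat major isoform gene as sharing a continuous CpG island.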
Although the structures and regulatory mechanisms of NDP kinase genes of other mammalian species are largely unknown, comparison of the two human cDNA clones, nm23-H2 and PuF, allows us to speculate on the existence of two groups of transcripts as observed for the rat NDP kinase α isoform gene: 1) nm23-H2 and PuF have the same coding sequence, demonstrating that they are products of an identical gene; 2) the PuF clone has an AG sequence, a candidate splicing acceptor site, at position −5, whereas the nm23-H2 clone does not have the dinucleotide and contains a completely different 5′-untranslated sequence; and 3) the PuF clone has multiple CG dinucleotides in the short 5′-untranslated stretch, which may represent part of a CpG island. These data suggest that the nm23-H2 clone may be representative of the rat group 1 type α homologue, whereas the PuF clone may belong to the group 2 type α homologue.
Heterogeneous transcription initiation can be interpreted mainly from two biological aspects: different transcription initiation sites 1) produce different translated products (peptides) or 2) are regulated by distinct elements. Regarding the rat NDP kinase isoforms, despite the huge heterogeneity of the 5′-regions of the transcripts, there are stop codons between an upstream ATG and the authentic initiation codon of the open reading frame of the enzyme, thus ruling out the former possibility. On the other hand, our RNase protection analyses demonstrated characteristic differences in the expression of the group 1 and 2 types of NDP kinase α mRNA in tissues and cell lines. The upstream regulatory region may play an essential role in driving constitutive and predominant expression of the group 1 type α mRNA in most cells, whereas the downstream regulatory region may be responsible for high expression of group 2 type α mRNA in certain tissues or cells such as muscles. Furthermore, the observations that the group 1 type α mRNA, the major component of the rat NDP kinase mRNA, was increased on immortalization of the cells (Fig. 4) and decreased in a subline with higher metastatic potential (33) imply possible roles for the upstream region in tumorigenic processes.
In situ hybridization analyses provided informative data in support of the basic mechanisms of gene expression inferred from the RNase protection assay: 1) predominant expression of the group 1 mRNA was confirmed in most cells examined; 2) the group 2 mRNA was expressed in some of these cells; in other words, there were no cell types that exhibited the group 2 mRNA alone; and 3) in some cells the group 1 mRNA was almost exclusively expressed. Considering the CAT assay data (Fig. 6) in combination with these observations, the core promoter activity localized in the region between −568 and −452 might be necessary not only for the expression of the group 1 type transcripts but also for the expression of the group 2 type ones that initiate far downstream from the candidate general promoter region. It follows that, in cooperation with the universal core promoter activity, additional promoter activities could be generated by downstream cis-elements and trans-acting factors, such as putative E box consensus sequences (see Fig. 1) and the muscle-specific transcription factors (reviewed in Ref. 43), for the preferential induction of the group 2 transcripts in muscle cells.
Regarding the translational control mechanisms (reviewed in Refs. 21 and 22), there have been no definitive data indicating regulatory mechanisms of NDP kinase expression at the posttranscriptional steps. The data presented in this study suggest the possible biological significance of the multiple forms of the 5′-untranslated region: the group 1 type α mRNA, which is constitutively expressed, may work for an "idling" state at a minimal translation rate, whereas the "induced" form (the group 2 type mRNA) could be used at a higher rate in case of need. Hence these differential translation rates seem to guarantee homeostasis in maintaining triphosphate nucleotide pools in response to different circumstances. The predicted conservation of these multiple forms of NDP kinase mRNA for the α isoform among mammalian species could also give a rationale for its biological significance. Recently, the meaning of multiple transcription initiation sites has been reported in the case of an E. coli enzyme, phosphoenolpyruvate (44). The heterogeneity of the 5′-untranslated region provides the mRNA with distinct stabilities; whereas the constitutively synthesized transcript is stable, the stimulus-induced one is rather labile. It follows that the two types of mRNA expression enable quick adaptation by executing fine control of the enzyme amount, resulting in maintenance of metabolic homeostasis.
Although this study indicated that heterogeneity of the 5′-untranslated region may determine the translational efficiency of rat NDP kinase α mRNA, the question of how one or some of the multiple transcription initiation sites are chosen in response to various milieus is crucially important for understanding the posttranscriptional as well as transcriptional regulation. We anticipate that our observations and concepts on NDP kinase gene expression will shed light not merely on the expression of this enzyme per se but also on that of housekeeping enzymes in general.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) D89068.
We thank Dr. Tosifusa Toda for computer analyses, Dr. Renzo Hirayama for heartfelt enthusiasm, and Dr. Lance A. Liotta for critically reviewing the manuscript and for helpful discussions.