(Received for publication, March 4, 1997)
From the Departments of Medicine, and Anatomy & Cell Biology, Columbia University College of Physicians and Surgeons, New York, New York 10032
Phosphatidylcholine (PC) is the most abundant
eukaryotic phospholipid and serves critical structural and
cell-signaling functions. CTP:phosphocholine cytidylyltransferase (CT)
is the rate-limiting enzyme in the CDP-choline pathway of PC
biosynthesis, which is utilized by all tissues and is the sole or major
PC biosynthetic pathway in all non-hepatic cells. Herein, we present
the complete structure of the murine CT (Ctpct) gene. One
P1 genomic clone and six subsequent plasmid subclones were isolated and
analyzed for the exon-intron organization of the Ctpct
gene. The gene spans approximately 26 kilobases and is composed of 9 exons and 8 introns. The exons match the distinct functional domains of
the CT enzyme: exon 1 is untranslated; exon 2 codes for the nuclear
localization signal domain; exons 4-7 encompass the catalytic domain;
exon 8 codes for the α-helical membrane-binding domain; and exon 9 includes the C-terminal phosphorylation domain. Two transcriptional initiation sites, spaced 35 nucleotides apart, were identified using
5′-rapid amplification of cDNA ends polymerase chain reaction. The
5′ natural flanking region was found to lack TATA or CAAT boxes and to
contain GC-rich regions, which are features typical of promoters of
housekeeping genes. Several sites that have the potential to interact
with transcription regulatory factors, such as Sp1, AP1, AP2, AP3, Y1,
and TFIIIA, were identified in the 5′-region of the gene and found to
be distributed in two distinct clusters. These data will provide the
basis for future studies on the cis- and
trans-acting factors involved in Ctpct gene
transcription and for the creation of induced mutant mouse models of
altered CT activity.
Synthesis of the most abundant eukaryotic phospholipid, phosphatidylcholine (PC), which serves critical structural and cell-signaling functions, involves three major enzymatic steps: phosphorylation of choline, synthesis of CDP-choline from choline-phosphate and CTP, and transfer of choline-phosphate from CDP-choline to diacylglycerol to form PC (1, 2). This pathway, called the CDP-choline, or Kennedy, pathway, is the major or sole one used by all extrahepatic tissues. PC biosynthesis in hepatocytes also occurs via this pathway, but an alternative one in which phosphatidylethanolamine is converted to PC by phosphatidylethanolamine N-methyltransferase (PEMT) is also used by this cell type (1, 2). The rate-limiting step in the Kennedy pathway is the synthesis of CDP-choline, which is catalyzed by the enzyme CTP:phosphocholine cytidylyltransferase (CT). CT cDNAs from several different species, including rat (3), mouse (4, 5), hamster (6), and human (5), have been cloned and sequenced. All of these CT cDNAs encode a CT protein of 367 amino acids, and the sequences are highly homologous among the different species.
CT exists in both soluble and nonintegral membrane-bound forms and is subject to both pre-translational (7-9) and post-translational regulation (1, 2). Post-translational regulation may involve binding of CT to membranes in cells as well as changes in C-terminal phosphorylation. In addition, CT mRNA levels increase with growth factor stimulation of certain cells (7), after partial hepatectomy in rat liver (9), and during differentiation, and there is a decrease in CT mRNA after overexpression of PEMT2 in hepatoma cells (8). Whether the changes in mRNA levels under these conditions are due to changes in CT gene transcription or to changes in CT mRNA stability (cf. Ref. 7) has not been fully investigated. Importantly, all of these regulatory studies have been conducted in vitro or in cultured cells, and, in a few cases, results obtained from different laboratories have been contradictory. Thus, the physiological regulation of CT activity, particularly in vivo, is an important area for further investigation.
Our laboratory has recently become interested in the physiology and
regulation of CT during atherogenesis (10-12). We found that free
cholesterol loading of macrophages, an important event in advanced
atherosclerosis, leads to the induction of CT activity, PC
biosynthesis, and PC mass (10, 11). This response helps the macrophages
adapt to potentially toxic levels of cellular free cholesterol, and
failure of this response may be one cause of an important lesional
event, namely macrophage necrosis (12). To test these ideas in
vivo using induced mutant mouse models, it became necessary for us
to determine the structure of the murine CT, or Ctpct, gene.
This information should also be useful in addressing some of the
uncertainties regarding pre- and post-translational CT regulation
described above. Given the central importance of CT, it is surprising
that only a small part of the structure of the Ctpct gene
has been reported thus far (4). Herein, we present the complete
structure of the murine Ctpct gene, which reveals a
relationship between exon organization and functional domains of CT,
the existence of two transcriptional initiation sites, and the presence
of several potential 5′-upstream cis-elements that may be
involved in gene transcription.
All chemical reagents were purchased from either
Sigma or Fisher. All restriction endonucleases and other enzymes were
from New England Biolabs (Beverly, MA). The [α-32P]dCTP
was purchased from DuPont NEN. The random primer labeling kit, the 5′-
and 3′-RACE PCR kits, and all synthesized primers were from Life
Technologies, Inc. The DNA preparation kit was obtained from Qiagen
(Chatsworth, CA), and Sequenase Version 2.0 DNA sequencing
kit was from U. S. Biochemical Corp. The Taq DNA polymerase
was from Perkin-Elmer. QuickHyb solution and Taq extender PCR additive were from Stratagene (La Jolla, CA). RNA Zol B was from
Tel-Test "B", Inc. (Friendswood, TX), and the TA cloning kit was
from Invitrogen (San Diego, CA).
A PCR product amplified from rat CT cDNA (generously
supplied by Dr. R. B. Cornell, Simon Fraser University, Canada) by a set of primers (Nos. 820 and 976 as shown in Table I) was used for
screening a mouse 129/J stem cell genomic DNA library in a P1 vector
(Genome Systems, Inc., St. Louis, MO). Four P1 clones (Nos. 4901, 4902, 4903, and 4904) containing the Ctpct gene were obtained.
These four P1 clones were further confirmed by Southern blot analysis
using three PCR products as probes in QuickHyb solution (Stratagene)
following the protocol of the manufacturer. The three probes, which
were located in the 5′-, internal, and 3′-regions of the CT
cDNA, were produced by PCR using primers Nos. 137 and 305 (5′), 820 and 976 (internal), and 1003 and 1177-A (3′), respectively, as shown in
Table I. The three probes were labeled with α-32P using a
random primer labeling kit (Life Technologies, Inc.) following the
protocol of the manufacturer. One of the P1 clones, No. 4904, contained
the entire Ctpct gene (see Fig. 1) based on the results of
the Southern blot. Clone No. 4904 was then digested by either
EcoRI or PstI and subcloned into pBluescript
KS+ vector, and six subclones containing Ctpct
gene fragments were identified by Southern blot using CT cDNA as
the probe.
Identification of the Structure of the Murine Ctpct Gene
The six subclones mentioned above were sequenced to
determine exon sequences and exon-intron boundaries. In addition, the 5′-upstream region of clone 6 was sequenced to identify potential cis-elements involved in binding transcription factors.
DNA was prepared using the Qiagen DNA preparation kit following the protocol of the manufacturer. DNA sequencing was performed either with an automated sequencer (Applied
Biosystems/Perkin-Elmer model 373A) in the DNA Core facility of
Columbia University or manually by the dideoxynucleotide chain
termination method using Sequenase Version 2.0 DNA sequencing reagents
(U. S. Biochemical Corp.) following the protocol of the manufacturer.
The primers used for DNA sequencing are listed in Table I. Sequences
were analyzed using the Wisconsin GCG software package (13). Exon sequences and exon-intron boundaries were defined by comparison with
both murine CT cDNA (4, 5) and rat CT cDNA (3). The intron
sizes were defined by PCR using P1 clone 4904 as template and the
primers listed in Table I. For introns less than 3 kb, the following
PCR condition was used: initial cycle of 94 °C for 1 min, 50 °C
for 1 min, and 72 °C for 2 min; followed by 30 cycles of 94 °C
for 30 s, 50 °C for 30 s, and 72 °C for 2 min; plus a final extension of 72 °C for 6 min. For introns larger than 3 kb,
Taq extender PCR additive (Stratagene) was added to the PCR reaction following the instructions of the manufacturer, and the PCR
condition was adjusted as follows: initial cycle of 94 °C for 1 min,
50 °C for 1 min, and 72 °C for 10 min; followed by 30 cycles of
94 °C for 30 s, 50 °C for 30 s, and 72 °C for 8 min; plus a final extension of 72 °C for 10 min.
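The two cycling programs above can be written down as data, which makes the difference between them (extension time only) easy to see and lets the total programmed hold time be computed. The following is a minimal sketch, not part of the original methods: the class and variable names are hypothetical, the temperatures and times are taken from the text, and ramp times between steps are ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    """One hold in a thermocycler program."""
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Program:
    """Initial cycle, repeated cycle, and final extension."""
    initial: Tuple[Step, ...]
    cycle: Tuple[Step, ...]
    n_cycles: int
    final_extension: Step

    def hold_seconds(self) -> int:
        """Total programmed hold time; ramping between steps is not counted."""
        initial = sum(s.seconds for s in self.initial)
        per_cycle = sum(s.seconds for s in self.cycle)
        return initial + self.n_cycles * per_cycle + self.final_extension.seconds


# Introns shorter than 3 kb (values from the text).
SHORT_INTRON = Program(
    initial=(Step(94, 60), Step(50, 60), Step(72, 120)),
    cycle=(Step(94, 30), Step(50, 30), Step(72, 120)),
    n_cycles=30,
    final_extension=Step(72, 360),
)

# Introns longer than 3 kb, run with Taq extender additive (values from the text).
LONG_INTRON = Program(
    initial=(Step(94, 60), Step(50, 60), Step(72, 600)),
    cycle=(Step(94, 30), Step(50, 30), Step(72, 480)),
    n_cycles=30,
    final_extension=Step(72, 600),
)

if __name__ == "__main__":
    for name, prog in (("<3 kb", SHORT_INTRON), (">3 kb", LONG_INTRON)):
        print(f"{name}: {prog.hold_seconds() / 60:.0f} min of programmed holds")
```

Under these assumptions the short-intron program amounts to 100 min of holds and the long-intron program to 292 min; actual run times would be longer because of ramping.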
5′-RACE (or "anchored") PCR was conducted to determine
the transcriptional initiation sites of the murine Ctpct
gene, using a 5′-RACE PCR kit (Life Technologies) according to the
protocol of the manufacturer. Briefly, total RNA was isolated from the livers of 10-week-old female 129/J mice using RNA Zol B following the
instructions of the manufacturer. RT-PCR was performed using SuperScript reverse transcriptase, primer No. 382 (Table I), and the
above RNA (as template), as shown in Fig. 2A. The resulting first strand of 5′-CT cDNA fragment was purified using a GlassMax spin cartridge. Then, oligo-dC was added to the 5′-end by terminal deoxynucleotidyl transferase. Finally, PCR was performed using the
5′-tailed CT cDNA fragment as template together with primer No. 305 (Table I) and an abridged anchor primer
(5′-GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG-3′). The resulting PCR
products were probed by Southern blot with the 32P-labeled
5′-CT cDNA PCR product (using primer Nos. 137 and 305) employed
previously for screening the 5′-end of the genomic clones (see above).
PCR products that hybridized to this 5′-probe were cloned into pCR2.1
vector using a TA cloning kit (Invitrogen) following the protocol of
the manufacturer. The CT transcriptional initiation sites were
identified by sequencing of the above clones using primer No. 305.
Determination of the 3′-UTR
3′-RACE PCR was
employed to determine the 3′-UTR of CT cDNA using a 3′-RACE PCR kit
(Life Technologies) following the protocol of the manufacturer.
Briefly, total RNA was isolated as described above. RT-PCR was
performed using an adapter primer
(5′-GGCCACGCGTCGACTAGTACTTTTTTTTTTTTTTTTT-3′). The resulting
first strand of CT cDNA was used as template for further PCR using
the primer No. 1177-S (Table I) and the adapter primer.
The PCR product was cloned into pCR2.1 vector and sequenced.
As described in detail under "Experimental
Procedures," four clones were isolated from a murine 129/J stem cell
genomic DNA library in the P1 vector, and one of these genomic clones
(No. 4904) was found to contain the entire Ctpct gene by
Southern blot using various CT cDNA probes (Fig. 1).
Clone No. 4904 was digested by either EcoRI or
PstI and subcloned into pBluescript KS+ vector.
Six subclones encompassing parts of the Ctpct gene were identified by Southern blot analysis, and these subclones were sequenced. As shown in Fig. 1, clone 1 (2 kb) contains the 3′ portion
of intron I and the 5′ portion of exon 2; clone 2 (3 kb) contains the
3′ portion of intron I, all of exon 2, and the 5′ portion of intron II;
clone 3 (4 kb) contains the 3′ portion of intron II, all of exon 3, and
the 5′ portion of intron III; clone 4 (12 kb) covers the 3′ portion of
intron III, all of exons 4-8 and introns IV-VII, and the 5′ portion of
intron VIII; clone 5 (4 kb) contains the 3′ portion of intron VIII, all
of exon 9, and the 3′-NFR; and clone 6 contains the 5′ portion of
intron I, all of exon 1, and the 5′-NFR.
The Ctpct gene is approximately 26 kb in length, which is
~17 times the length of the CT cDNA. The gene is composed of 9 exons interrupted by 8 introns. Exon 1 contains the 5′-UTR, which is
interrupted by intron I 10 base pairs upstream of the ATG start
codon in exon 2; exon 9 contains the 3′-UTR (Figs. 1 and 4). The sizes of the exons range from 72 to 548 bp (Table
II), and the sizes of the introns, which were estimated by PCR
amplification using a pair of primers located on flanking exons (see
Table I), range from 0.5 kb to 6 kb (Fig. 1 and Table
III). All exon-intron boundaries were sequenced and are listed in Table
III. The boundary sequences at the 5′- and 3′-ends of all of the
introns are GT and AG, respectively (Table III), which are consensus
sequences for pre-mRNA splicing recognition donor and acceptor
sites (14). As described in detail under "Discussion" and depicted
in Fig. 1, the organization of the exons of the Ctpct gene
is related to the distinct functional domains of the CT enzyme.
The transcriptional
initiation sites of the Ctpct gene were determined by
5′-RACE PCR. PCR was performed using a 5′-fragment of 129/J murine
liver CT cDNA tailed with oligo-dC as template; the primers
used were an abridged anchor primer and primer No. 305 (Table I and
Fig. 2A). The reaction generated products of
two distinct sizes, 400 and 370 bp, as shown in Fig. 2B.
Southern blots of these PCR products hybridized with a 5′-CT cDNA
probe (Fig. 2B), indicating that both products were 5′-CT
cDNA fragments with different lengths. 5′-RACE PCR was repeated
using a different preparation of total RNA from 129/J mouse liver, and
the result was the same. Therefore, there appear to be two
transcriptional initiation sites utilized in the Ctpct gene.
To identify these sites, the 400- and 370-bp PCR products were each
cloned into pCR2.1 vector and sequenced using primer No. 305 (Table I).
As shown in Fig. 3, one of the start sites is 82 nucleotides upstream of the ATG start codon and marked as position +1,
and the other is 47 nucleotides upstream of the ATG and marked as
position +35. Both start with A, a purine, the most common
transcription-initiating nucleotide.
Clone 6 (Fig. 1), which contained a 4-kb PstI digestion
fragment from the Ctpct gene, was identified by Southern
blot using exon 1 as a probe. Clone 6 contains approximately 2.6 kb of
the 5′-NFR of the Ctpct gene, all of exon 1 (72 bp), and
approximately 1.4 kb of intron I. We sequenced 600 bp of DNA
immediately upstream of the first transcription start site (position
+1). In analyzing this putative promoter region of the Ctpct
gene (Fig. 3), we found characteristics that are typical of
housekeeping gene promoters, such as the absence of TATA or CAAT boxes and a high G + C content (see "Discussion" and Refs. 15-17). Computer
analysis of the 5′-NFR revealed a number of potential transcription
factor-binding sites, as indicated in Fig. 3. A total of
five potential Sp1-binding sites (GC boxes) (18) were located at
positions −9, −58, −66, −70, and −144; the sites at −66 and −70
overlap. An AP1 site (ATGAGTCAA) was located at position −350,
an AP2 site (GCCGGCGGG) at position −324, and an AP3 site
(TGTGGTTT) at position −107. Two TFIIIA sites (CGGGCTCGAA and
CAGGTCGGAA) were located at positions −319 and −381. Two reversed Y1
sites (AGAGGGCGGG and AGCCGGCGGG) were located at positions −73 and
−325; the first overlaps with an Sp1 site, and the second overlaps
with AP2 and TFIIIA sites.
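The kind of scan that locates such sites can be illustrated with a short sketch. This is not the program used in the paper, which is not specified beyond "computer analysis"; it is a plain exact-match search on both strands, assuming that positions are numbered relative to the first transcription start site (+1) and that a site is reported by its 5′-most base on the top strand. The motif strings are the ones quoted in the text; the function names and the toy sequence are hypothetical.

```python
# Site sequences quoted in the text for the Ctpct 5'-NFR.
MOTIFS = {
    "AP1": ["ATGAGTCAA"],
    "AP2": ["GCCGGCGGG"],
    "AP3": ["TGTGGTTT"],
    "TFIIIA": ["CGGGCTCGAA", "CAGGTCGGAA"],
    "Y1": ["AGAGGGCGGG", "AGCCGGCGGG"],
}

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def find_sites(upstream, motifs=MOTIFS):
    """Find exact motif matches on either strand of an upstream sequence.

    `upstream` is the top-strand sequence whose last base is position -1,
    i.e. the base immediately 5' of the transcription start site (+1).
    Returns (name, position, strand) tuples sorted by position, where
    position is that of the 5'-most matched base on the top strand.
    """
    n = len(upstream)
    hits = []
    for name, patterns in motifs.items():
        for pattern in patterns:
            queries = {"+": pattern}
            rc = reverse_complement(pattern)
            if rc != pattern:  # avoid double-counting palindromic motifs
                queries["-"] = rc
            for strand, query in queries.items():
                start = upstream.find(query)
                while start != -1:
                    hits.append((name, start - n, strand))
                    start = upstream.find(query, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Invented 25-nt toy sequence: an AP1 site on the top strand and an
    # AP3 site on the bottom strand. It is NOT the Ctpct promoter.
    toy = "TTTT" + "ATGAGTCAA" + "CC" + reverse_complement("TGTGGTTT") + "GG"
    print(find_sites(toy))  # [('AP1', -21, '+'), ('AP3', -10, '-')]
```

Exact matching is the simplest possible choice; tools built around consensus sequences or weight matrices tolerate mismatches and would generally report more candidate sites.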
3′-RACE PCR (see "Experimental Procedures" and
Fig. 4A) yielded a 370-bp product whose
sequence is shown in Fig. 4B. The sequence from the stop
codon (TAA) to the terminal adenine, which is the site where
polyadenylation occurs, is 341 bp (3′-UTR). Although a 3′-terminal
adenine is typical (19), an unusual polyadenylation signal sequence
(AATATA) was located 14 bp upstream of the site where the poly(A) tail
is added during the maturation of CT mRNA (see Ref. 19). As depicted in Fig. 4C,
the 3′-UTR was located on exon 9 of the Ctpct gene
(i.e. without an intervening intron) since the sequence of
the 3′-RACE PCR product (above) was identical to the sequence of the
3′-end of exon 9 of the Ctpct gene (from clone 5 in Fig. 1).
The conserved sequence (TTTTTT), which is one of several possible downstream signals required for efficient eukaryotic mRNA
polyadenylation (20), was located in the 3′-NFR of the Ctpct
gene, 50 bp downstream of the 3′-terminus of exon 9 (Fig. 4C).
The structure of the murine Ctpct gene reveals several
interesting points. First, the organization of the exons has a distinct relationship to the functional domains of the CT protein (Fig. 1). Exon
2 encodes the first 39 amino acid residues of CT, which includes a
signal sequence (residues 8-28) that targets CT to the nucleus (21),
where the enzyme is localized in certain cell types (22). Exon 3 encodes residues 40 through 72, for which no specific function has been
reported. Exons 4-7 code for the catalytic domain (residues 73-236)
(3); exon 4 contains the codons for a HSGH motif (residues 89-92),
which is thought to mediate binding of CTP by the enzyme (23). Exon 8 encodes amino acids 237-299, which contain a 58-residue α-helix
containing three contiguous 11-residue repeats (residues 256-288);
this α-helix is thought to play an important role in the
membrane-binding properties and enzymatic activity of CT (5, 24, 25).
This exon also encodes a densely positively charged region, a cluster of
five lysine residues within a 7-amino acid stretch (residues
248-254), the function of which is unknown. Exon 9 codes for the
C-terminal part of the CT protein (residues 300-367), which includes a
second α-helix (shown not to be necessary for membrane binding (26)) and multiple serine residues that become phosphorylated in
vivo (27, 28); phosphorylation of these serines may interfere with the binding of CT to membranes (29, 30) and with the activation of CT
by lipid activators, such as PC/oleic acid liposomes (31). Interestingly, the catalytic domain of mammalian CT is highly homologous to yeast CT while the nuclear localization signal, membrane-binding, and phosphorylation domains of mammalian CT are not
(24, 32). Thus, it appears as if the exons encoding the basic catalytic
unit of CT evolved first and were later embellished with additional
exon cassettes resulting in more complex post-translational regulatory
control.
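The exon-to-residue assignments given above can be checked for internal consistency: the five coding segments should cover residues 1-367 with no gaps or overlaps. The sketch below is only an illustration of that bookkeeping, with hypothetical names; the residue ranges and domain labels are those stated in the text, and exons 4-7 are treated as one segment because their individual boundaries are not given here.

```python
CT_LENGTH = 367  # amino acids, as stated for all cloned CT cDNAs

# (segment, first residue, last residue, domain as described in the text)
EXON_RESIDUES = [
    ("exon 2", 1, 39, "N-terminal region with nuclear localization signal (8-28)"),
    ("exon 3", 40, 72, "no specific function reported"),
    ("exons 4-7", 73, 236, "catalytic domain"),
    ("exon 8", 237, 299, "alpha-helical membrane-binding domain"),
    ("exon 9", 300, 367, "C-terminal phosphorylation domain"),
]


def tiles_protein(segments, length):
    """True if the segments cover residues 1..length contiguously, in order."""
    expected_start = 1
    for _name, start, end, _domain in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


def exon_for_residue(residue, segments=EXON_RESIDUES):
    """Return the segment name that encodes a given residue number."""
    for name, start, end, _domain in segments:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside 1-{CT_LENGTH}")


if __name__ == "__main__":
    print(tiles_protein(EXON_RESIDUES, CT_LENGTH))  # True
    print(exon_for_residue(90))   # exons 4-7 (HSGH motif, residues 89-92)
    print(exon_for_residue(250))  # exon 8 (lysine cluster, residues 248-254)
```

The same table also confirms that the three 11-residue repeats (residues 256-288, 33 residues in total) fall entirely within the exon 8 segment.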
Other interesting features of the murine Ctpct gene include
the use of two transcriptional initiation sites, the presence of an
untranslated exon 1 that is approximately 6 kb upstream from the
initiation codon in exon 2, and the large size of the gene. Other genes
involved in lipid biosynthesis, transfer, and metabolism also have an
untranslated first exon, including the genes encoding
phosphatidylethanolamine N-methyltransferase-2 (PEMT2) (33), apolipoproteins
A-I, A-II, C-II, C-III, and E (34), and phospholipid transfer protein
(35). The gene for PEMT2, a 199-amino acid integral membrane protein
that catalyzes the synthesis of PC in hepatocytes, has three other
features in common with the Ctpct gene, namely two
transcriptional initiation sites, a very large size (~30-fold larger
than its cDNA versus ~20-fold larger for the
Ctpct gene), and the absence of a TATA or CAAT box in the
5′-upstream region of the gene (see below) (33). Whether these
similarities denote common transcriptional regulatory features between
the two PC biosynthetic genes must await further studies on both
genes.
In this regard, the cloning of the Ctpct gene will hopefully
lead to future studies directed at understanding how transcription of
the CT gene is regulated. The 5′-upstream sequence revealed no TATA or
CAAT box, but this region is rich in G + C (71% in the first 350 upstream nucleotides) and contains five GC boxes corresponding to
consensus Sp1-binding sites (36) (Fig. 3). Sp1-binding sites have been
shown to be present in promoters of numerous viral and cellular genes
and generally located at 40-100 nucleotides upstream of the
transcriptional initiation sites (36); in the putative Ctpct
promoter region, three potential Sp1-binding sites (−58, −66, and
−71) are present in this location. Lack of a TATA or CAAT box,
multiple origins of transcription, and GC-rich Sp1-binding sites are
often found together and are typical of housekeeping genes (37-41).
Other consensus transcriptional factor-binding sites found in the
5′-upstream region of the Ctpct gene include those for AP1,
AP2, AP3, TFIIIA, and Y1 (Fig. 3); these sites and those for Sp1 are
concentrated in two areas of the putative promoter region, namely
between nucleotides −14 and −140 and between −310 and −392 (Fig.
3). Future studies will determine whether Sp1 and these other factors
are involved in basal transcription of the gene or in transcriptional
regulation, such as might occur during growth factor stimulation of
certain cells (7), after partial hepatectomy in rat liver (9), and
after overexpression of PEMT2 in hepatoma cells (8).
The major impetus for our laboratory to clone the Ctpct gene was related to our interest in the role of the CT enzyme and PC biosynthesis during atherogenesis (10-12) and our plans to study this relationship in vivo using induced mutant mice. Cloning of the gene was necessary for future gene targeting to create induced mutant mouse models of altered arterial wall PC biosynthesis. Additional induced mutant mouse models using CT constructs mutated in regions thought to be important in post-translational regulation (e.g. phosphorylation or membrane-binding domains) and in the consensus cis-acting sequences of the putative promoter region of the Ctpct gene will also be useful in understanding the regulation of CT in vivo.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) U84199, U84200, U84201, U84202, U84203, U84204, U84205, U84206, and U84207.
The authors wish to thank Dr. Rosemary B. Cornell (Simon Fraser University) for the rat CT cDNA, and Drs. Jeanine D'Armiento and Xian Cheng Jiang for helpful discussions.