©1996 by The American Society for Biochemistry and Molecular Biology, Inc.
Transcriptional Regulation of the Gene Coding for Human Protein C (*)

(Received for publication, December 14, 1995; and in revised form, February 14, 1996)

Carol H. Miao, Wan-Ting Ho, Daniel L. Greenberg, and Earl W. Davie (§)

From the Department of Biochemistry, University of Washington, Seattle, Washington 98195-7350

ABSTRACT

The promoter for the gene coding for human protein C has been characterized as to nucleotide sequences that regulate the synthesis of mRNA. The major transcription start site was found 65 nucleotides upstream from the first intron/exon boundary along with two minor sites. Functional characterization of 1528 base pairs at the 5`-end of the gene was then carried out by chloramphenicol acetyltransferase reporter assays, protection from DNase I digestion, and electrophoretic mobility shift assays employing HepG2 and HeLa cells. One of the upstream regions (nucleotides -25 to +9) contained binding sites for at least two different transcription factors, including a hepatic nuclear factor 1-binding site (-10 to +9) and two overlapping and oppositely oriented hepatic nuclear factor 3-binding sites (-25 to -11). A second major region (PCE1) (+12 to +30) appeared to be a unique, liver-specific regulatory sequence. An Sp1-binding site in exon I (+58 to +65) was also recognized by cotransfection experiments with an Sp1 expression plasmid. Specific mutations in these promoter elements reduced transcriptional activity and abolished the binding of hepatic nuclear proteins. Finally, a strong silencer element (PCS1) (between -162 and -82) and two possible liver-specific enhancer regions (PCE2 and PCE3), which interact coordinately with the promoter elements, were also found (between -1462 and -162).


INTRODUCTION

Protein C is a vitamin K-dependent zymogen of a serine protease that is present in plasma (1, 2). The active form, called activated protein C, regulates the blood coagulation cascade by inactivating activated factors V and VIII through limited proteolysis (3). Protein C is synthesized in hepatocytes as a single-chain precursor, which undergoes processing steps to give rise to a two-chain molecule held together by a disulfide bond. Additional post-translational modifications include carboxylation of 12 amino-terminal glutamic acid residues (4), hydroxylation of an aspartic acid residue (5, 6), and glycosylation of several amino acid residues (7). The two-chain form is converted to activated protein C by thrombin in the presence of thrombomodulin by the cleavage of a 12-residue peptide from the amino terminus of the heavy chain (2, 8). Protein C together with its cofactor protein S, antithrombin III, and tissue factor pathway inhibitor represent major independent pathways for the regulation of blood coagulation. A deficiency of protein C constitutes a risk factor for venous thrombosis as well as other thrombotic disorders (9, 10).

A large number of mutations have been found in the genes from patients with protein C deficiency, including several in the 5`-flanking region of the gene. Recently, activated protein C resistance associated with a factor V Leyden mutation has been identified as a highly prevalent risk factor for thrombotic disease (11). However, individuals with a single genetic defect, such as protein C deficiency or activated protein C resistance, can be asymptomatic. Combined genetic defects often lead to a much higher thrombotic risk and support the concept that hereditary thrombophilia is often a multigenic disease (12, 13).

The gene for protein C is 11 kb (^1) in length and contains nine exons (14, 15). It is located on chromosome 2q13-q14. The gene shares significant organizational similarity with the genes coding for the other vitamin K-dependent proteins that circulate in blood. However, significant differences occur in the steady-state mRNA levels in liver and in the concentrations of these proteins in plasma.

A comparison of the 5`-flanking sequences of the protein C gene with those of the genes coding for the other vitamin K-dependent coagulation proteins indicates a significant DNA sequence divergence. Nevertheless, transcriptional regulation of these genes has certain common features. In this investigation, a number of positive elements as well as a negative element are identified that regulate the gene coding for human protein C. The data demonstrate that transcriptional activity of the TATA-less protein C promoter is largely dependent upon sequences surrounding the transcription initiation sites. Three liver-specific promoter regions are identified, including contiguous binding sites for HNF1 and HNF3 and a unique regulatory element (designated PCE1) present in exon I. Regions responsible for positive and negative regulations in the upstream enhancer region are also defined.


EXPERIMENTAL PROCEDURES

Sequencing of the 5`-Flanking Region of the Protein C Gene

A phage clone (pC6) was isolated previously by screening a human genomic DNA library with a radiolabeled cDNA probe for human protein C(14) . A 4.4-kb EcoRI fragment containing the 5`-flanking region of the protein C gene was removed from phage pC6 and inserted into the XbaI site of pCAT-0 at the 5`-end of the CAT gene to yield plasmid pPC-4.4kb. The 5`-end sequence of the protein C gene was completely sequenced on both strands using synthetic primers employing the dideoxy terminator method of Sanger et al.(16) .

Rapid Amplification of 5`-cDNA Ends

To analyze the 5`-end of the protein C gene, the liver 5`-RACE-Ready cDNA (CLONTECH) was amplified with the anchor primer (5`-CTGGTTCGGCCCACCTCTGAAGGTTCCAGAATCGATAG-3`) and two nested antisense primers (RPC2, 5`-ACGTAGCTGCCGTAGCCGTCGAAGTCGACG-3`; and RPC7, 5`-ACCGTCGACGTGCTTGGACCAGAAGGCCAG-3`) designed from the protein C cDNA sequence. Human liver 5`-RACE-Ready cDNA prepared from a 33-year-old Caucasian female donor is an uncloned cDNA library with a unique single-stranded anchor oligonucleotide (5`-CACGAATTCACTATCGATTCTGGAACCTTCAGAGG-NH(3)-3`) attached to the 3`-ends of the cDNAs. The PCR products were purified and cloned into pCRII vector (Invitrogen) for sequencing of the 5`-cDNA ends.
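The relationship between the anchor primer and the anchor oligonucleotide quoted above can be checked computationally. The following Python sketch is an illustration only and is not part of the published protocol; the function names are hypothetical. It reports how many nucleotides at the 3`-end of the anchor oligonucleotide are complementary to the anchor primer.

```python
# Illustrative check (not part of the published protocol): how many 3'-terminal
# nucleotides of the anchor oligonucleotide can base-pair with the anchor primer?

ANCHOR_PRIMER = "CTGGTTCGGCCCACCTCTGAAGGTTCCAGAATCGATAG"  # as quoted in the text
ANCHOR_OLIGO = "CACGAATTCACTATCGATTCTGGAACCTTCAGAGG"      # as quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_overlap(anchor: str, primer: str) -> int:
    """Length of the longest 3' suffix of `anchor` that equals a prefix of the
    reverse complement of `primer` (i.e. the stretch the primer can anneal to)."""
    target = reverse_complement(primer)
    for k in range(min(len(anchor), len(target)), 0, -1):
        if anchor[-k:] == target[:k]:
            return k
    return 0


overlap = three_prime_overlap(ANCHOR_OLIGO, ANCHOR_PRIMER)
print(overlap)
```

With the two sequences as printed, the sketch reports a 25-nucleotide complementary stretch at the 3`-end of the anchor oligonucleotide.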

Construction of Plasmids

Plasmid pCAT-0 was purchased from Promega, and plasmid pSVbeta was from CLONTECH. Plasmid pSV2-CAT was constructed as described previously(17) . A 1482-bp fragment containing 1462 bp upstream and 20 bp downstream of the transcription initiation site of the protein C gene was obtained by amplification of pC6 using the polymerase chain reaction technique. The resulting DNA fragment was cloned into the XbaI site of pCAT-0, generating the pPC-1482 construct. A 107-bp StuI-XbaI fragment containing the exon I sequence of the protein C gene was obtained by amplification of pC6 using the PCR technique and was inserted into the StuI and XbaI sites of pPC-1482 to yield the pPC-1528 construct. Deletion constructs from plasmids pPC-1.5kb and pPCE-1.5kb were obtained by the exonuclease III unidirectional deletion method, by digestion of convenient restriction sites, or by PCR techniques followed by ligation reactions. The sequences of these constructs were completely verified by dideoxy sequencing.
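The construct names follow from the promoter coordinates. As a minimal sketch, and assuming the usual convention that base numbering has no position 0 (so that -1 is immediately followed by +1), the fragment lengths can be computed as shown below; the helper name `span_length` is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Number of bases from `start` to `end` inclusive.

    Assumption: positions are numbered relative to the transcription start
    site (+1) and there is no position 0.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # correct for the missing position 0
    return length


# Coordinates as given in the text for the two parent constructs.
constructs = {"pPC-1482": (-1462, 20), "pPC-1528": (-1462, 66)}
for name, (start, end) in constructs.items():
    print(name, span_length(start, end))
```

Under this assumption the computed lengths (1482 and 1528 bp) agree with the construct names.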

Mutations in the promoter-binding sites were generated in plasmids pPC-1482 and pPC-1528 using oligonucleotide-directed mutagenesis and the polymerase chain reaction technique, respectively(18) . Overlapping oligonucleotides with mutations were used as primers, and pPC-1482 and pPC-1528 were used as templates. The oligonucleotides used for sequencing primers, PCR primers, and EMSAs were synthesized on an Applied Biosystems Model 380B DNA synthesizer.

Cell Culture and Transfections

Human hepatoma cells (HepG2) were cultured in Ham's F-12 medium supplemented with L-glutamine, antibiotics (penicillin, streptomycin, and neomycin), and 5% fetal calf serum. HeLa cells were cultured in minimal Eagle's medium supplemented with L-glutamine, antibiotics, 1% nonessential amino acids, 1% sodium pyruvate, and 10% fetal calf serum. Both cell lines were maintained in a 5% CO2 atmosphere at 37 °C. Plasmid DNA (5 µg) was cotransfected with pSVbeta (5 µg), which served as an internal control, into cultured cells by the calcium phosphate precipitation technique (19). Each transfection was repeated at least three times.

CAT and beta-Galactosidase Assays

CAT activity was measured by the method of Gorman et al.(20, 21) , and beta-galactosidase activity by the method of Herbomel et al.(22) . CAT activities were normalized to beta-galactosidase activities to correct for differences in transfection efficiency and cell concentrations.
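The normalization can be illustrated with a short Python sketch. The readings below are hypothetical numbers in arbitrary units, not data from this study, and averaging the replicate ratios is an assumption made for illustration; the text states only that CAT activity was normalized to beta-galactosidase activity and, in the legend to Fig. 2, that pPC-1528 was defined as 100%.

```python
from statistics import mean


def normalized_activity(cat: float, beta_gal: float) -> float:
    """CAT activity divided by beta-galactosidase activity for one transfection."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return cat / beta_gal


def percent_of_reference(replicates, reference_replicates) -> float:
    """Mean normalized activity of a construct as a percentage of the mean
    normalized activity of the reference construct."""
    sample = mean(normalized_activity(c, b) for c, b in replicates)
    reference = mean(normalized_activity(c, b) for c, b in reference_replicates)
    return 100.0 * sample / reference


# Hypothetical (CAT, beta-galactosidase) readings from three transfections each.
reference_construct = [(200.0, 2.0), (150.0, 1.5), (100.0, 1.0)]
test_construct = [(40.0, 2.0), (20.0, 1.0), (10.0, 0.5)]

relative = percent_of_reference(test_construct, reference_construct)
print(relative)
```

Dividing by the beta-galactosidase signal removes plate-to-plate differences in transfection efficiency before constructs are compared.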

DNase I Footprint Assays

DNase I footprint assays were performed in a 30-µl reaction volume containing 20 mM Hepes, pH 7.9, 25 mM KCl, 4 mM MgCl2, 4 mM spermidine, 0.5 mM EDTA, 12 ng of pUC18 DNA, and 3 µg of poly(dI-dC). Crude HepG2 nuclear extracts (containing 60 µg of total protein) were added and incubated for 10 min at room temperature. End-labeled DNA fragments (1-2 ng) were added, and incubation was continued for an additional 10 min at room temperature. The reaction mixtures were then digested with freshly diluted DNase I for 2 min. Digestion was stopped by adding 70 µl of stop solution (20 mM Tris-HCl, pH 7.5, 20 mM EDTA). The DNA was then extracted with phenol/chloroform, precipitated with ethanol, and analyzed on 6% polyacrylamide gels containing 7 M urea.

EMSAs

Nuclear protein extracts prepared from HepG2 cells were preincubated in a 20-µl reaction containing 20 mM Tris, pH 7.6, 8% Ficoll, 25 mM KCl, 4 mM MgCl2, 4 mM spermidine, 1 mM EDTA, 0.5 mM dithiothreitol, 3 µg of poly(dI-dC), and 50 ng of salmon sperm DNA. Competitor DNA was added as needed. After 10 min, the 32P-end-labeled duplex oligonucleotide (20,000 cpm) was added, and the reaction was incubated for 10 min at room temperature. Samples were then analyzed on a 5% polyacrylamide gel in 0.25 × TBE buffer (11.25 mM Tris borate, pH 8.4, 0.25 mM EDTA).


RESULTS

Sequence of the 5`-End of the Gene Coding for Human Protein C

More than 1500 bp from the 5`-noncoding region of the gene coding for human protein C was isolated from phage pC6 (14) and sequenced (Fig. 1). The sequence from -617 to +66 was identical to that previously reported by Foster et al.(14) , while the additional sequence from -1462 to -618 was established in the present study. The bases were numbered relative to the major transcription initiation site (see below).


Figure 1: Sequence of the 5`-end of the gene for human protein C. Bases are numbered relative to the major transcription start site (+1), marked with an asterisk. Two minor start sites are marked with double asterisks. Deletion constructs used in reporter gene assays were as follows: pPC-1482 contained protein C sequences from -1462 to +20; pPC-1528 contained sequences from -1462 to +66; and pPC-n-66 contained sequences from -n to +66. Exons are underlined. The translation start codon (ATG) is shown in boldface.



Transcription Initiation Site

Since the cellular level of the protein C mRNA is very low, rapid amplification of the 5`-cDNA ends was performed to identify the transcription start site(s). Two antisense primers (RPC2 and RPC7) designed from the protein C cDNA sequence and an anchor primer to the oligonucleotide-anchored 3`-ends of the cDNA library were used to amplify the 5`-region of the protein C mRNA employing a human liver cDNA library. PCR products were analyzed by 2% agarose gel electrophoresis, and the amplification product, which appeared as a band of 400 bp, was isolated and cloned for sequencing. Sequencing of the resulting transcripts revealed several transcription initiation sites with 80% of the transcripts starting at an A located 65 nucleotides upstream from the first intron/exon boundary or 1515 bp upstream from the translation start codon (AUG). This A was designated as +1 in the DNA sequence shown in Fig. 1. Another 18% of the transcripts started at -7 and 2% at +13. The +13 initiation site corresponds to the site assigned previously(15) . Also, a slightly different splice site was observed in the first intron/exon II junction (+1497, +1498, AG) compared with that reported earlier (+1493, +1494, AG)(15) .

Transcriptional Regulation of the Human Protein C Gene

To characterize sequences responsible for transcriptional regulation from the 5`-end of the gene, a 1482-bp segment including 20 bp from exon I (-1462 to +20) and a 1528-bp segment including 66 bp from exon I (-1462 to +66) were linked to a promoterless CAT reporter gene in plasmid pCAT-0. The resulting constructs, pPC-1482 and pPC-1528, were transfected into human hepatoma HepG2 and HeLa cells, and transient reporter gene expression was monitored by measuring CAT activity in the cell extracts. To correct for differences in cell number and DNA transfection efficiencies, the cells were cotransfected with a reference plasmid carrying the beta-galactosidase gene under the control of the SV40 early promoter-enhancer. After transfection, CAT activities were measured and normalized to beta-galactosidase activity. When plasmid pPC-1482, containing the 5`-flanking region of the protein C gene from -1462 to +20, was transfected into HepG2 cells, low but detectable levels of CAT activity were produced. Plasmid pPC-1528, containing the 5`-flanking region from -1462 to +66, resulted in CAT activity 10-fold higher than that obtained with pPC-1482. Only background levels of activity were observed with both plasmids in HeLa cells. These results indicate that the 5`-flanking sequence with the 20-bp exon I sequence is sufficient to direct basal liver-specific gene expression, but the inclusion of the entire exon I sequence (66 bp) results in high level and liver-specific expression of the gene.

A series of deletion constructs were then generated from plasmid pPC-1528 (Fig. 2A) and tested for activity in HepG2 and HeLa cells (Fig. 2B). A deletion from -1462 to -723 resulted in a 30% reduction in activity in HepG2 cells, and a further deletion to -162 caused another 30% reduction in activity. This suggested the presence of at least two enhancer regions (PCE3 and PCE2) between -1462 and -162 in the promoter. These reductions in activity, resulting from the deletions from -1462 to -162, were not observed in the absence of the full exon I sequence (data not shown). This suggests that the function of these enhancer elements depends upon the initial assembly of the initiation complex on the protein C promoter. Further stepwise deletions from -162 to -82 resulted in an increase of 4-fold in reporter gene activity, indicating the presence of a strong silencer element (PCS1) in this region. Deletion of the sequence from -82 to -42 resulted in a small but reproducible decrease in activity. Finally, a precipitous reduction in expression occurred upon deletion of the sequence from -42 to +66 (PCE1). These experiments indicate that one or more promoter elements are located from -42 to +66 and are functionally responsible for high efficiency transcription. This region also contains an Sp1 consensus sequence (+58 to +65) that may play a role in this activity (see below).


Figure 2: Transient expression of CAT activities by deletion constructs transfected into HepG2 and HeLa cells. A, a series of PC-CAT fusion constructs containing varying lengths of the protein C 5`-end sequences were transfected into HepG2 and HeLa cells. B, shown are CAT activities expressed by deletion constructs. CAT activity of pPC-1528 was arbitrarily defined as 100% in HepG2 cells and used as a reference to normalize the CAT activity data of other constructs.



Deletion constructs were also transfected into HeLa cells to determine further which region(s) was responsible for directing the liver-specific expression of the protein C gene. Plasmid pPC-42-66, containing the region from -42 to +66, exhibited much higher CAT expression than the promoterless pCAT-0 plasmid in HepG2 cells (Fig. 2B). In contrast, little increase was observed in HeLa cells. These data indicate that the region from -42 to +66 contains strong liver-specific elements as well as other regulatory elements. Small increments with plasmid pPC-82-66 and a decrease in CAT activity with plasmid pPC-162-66 were observed in HepG2 and HeLa cells. Furthermore, the exon I-dependent enhancer activity in the region from -1462 to -162 was not observed in HeLa cells, suggesting that the enhancer elements are liver-specific regulatory sequences. Further investigation is needed to define the function of these enhancer elements.
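The logic of the deletion analysis can be summarized by asking which mapped promoter elements each construct retains. The Python sketch below is a bookkeeping illustration only, not an analysis performed in this study. The element coordinates are those reported in this paper; the function name is hypothetical, and the coordinates assigned to pPC-42-34 (-42 to +34) are an assumption based on the pPC-n-66 naming scheme.

```python
# Proximal promoter elements and their coordinates as reported in this paper.
ELEMENTS = {
    "HNF3": (-25, -11),
    "HNF1": (-10, 9),
    "PCE1": (12, 30),
    "Sp1": (58, 65),
}

# Construct boundaries (5' end, 3' end). pPC-42-34 is assumed to span -42 to +34.
CONSTRUCTS = {
    "pPC-1528": (-1462, 66),
    "pPC-1482": (-1462, 20),
    "pPC-42-66": (-42, 66),
    "pPC-42-34": (-42, 34),
}


def element_status(element, construct) -> str:
    """Classify an element as 'complete', 'partial', or 'absent' in a construct."""
    e_start, e_end = element
    c_start, c_end = construct
    if c_start <= e_start and e_end <= c_end:
        return "complete"
    if e_end < c_start or e_start > c_end:
        return "absent"
    return "partial"


for name, bounds in CONSTRUCTS.items():
    summary = {el: element_status(span, bounds) for el, span in ELEMENTS.items()}
    print(name, summary)
```

In this bookkeeping, pPC-1482 retains the HNF3 and HNF1 sites but carries only part of PCE1 and lacks the Sp1 site, which is consistent with its low activity relative to pPC-1528.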

DNase I Footprint Analyses

Nuclear proteins from HepG2 cells protected two distinct sequence areas in the promoter region from -42 to +58 (Fig. 3, A and B). Sense strand footprints were designated FP-I (+12 to +30) and FP-II (-25 to +9), as shown in Fig. 3C. Footprints on the antisense strand were very similar and showed only very minor differences. These regions were not protected by the HeLa cell nuclear extract, as shown in Fig. 3A (lane 5). Comparison of the sequences of the FP-II region with known consensus binding sites revealed that this region contained sequences characteristic of binding sites for several liver-specific transcription factors. The downstream half-sequence of the FP-II region (-10 to +9) was homologous to an HNF1 recognition site (23), whereas the upstream half-sequence (-25 to -11) included overlapping HNF3-binding sites homologous to two oppositely oriented TGTTT motifs (24), shown in Table 1. Four point mutations have been reported in this region in patients with low protein C levels. These occur at -20 (A→G), -15 (T→A), +3 (C→T) (25, 26), and -2 (T→C) (27). The mutations at +3 and -2 occur in the HNF1 site, and those at -15 and -20 in the HNF3 sites (Fig. 3C). The FP-I region, from +12 to +30, designated PCE1, was located within exon I, immediately downstream from the initiation sites. No homology has yet been found for this sequence when compared with recognition sequences for other known transcription factors.


Figure 3: DNase I footprint analyses. A, sense strand of the protein C promoter region. A DNA fragment containing the protein C promoter (from -42 to +58) was labeled at the 3`-end of the sense strand and was subjected to DNase I digestion in the absence (lane 2) and presence of HepG2 nuclear extracts (lanes 3 and 4) and HeLa nuclear extracts (lane 5). A purine-specific sequence marker (G + A; lane 1) was obtained by Maxam-Gilbert sequencing of the end-labeled fragment (55). B, antisense strand. Lane 1, G + A sequence marker; lane 2, without nuclear extracts; lane 3, with HepG2 nuclear extracts. Brackets indicate regions that are protected from DNase I digestion. C, sequence of the protected regions identified as FP-I and FP-II. Naturally occurring mutations and transcription start sites (*) are indicated.





To evaluate the effect of the mutations on transcription of the gene in the regions identified by DNase I footprinting, the pPC-1528 construct was mutated and transfected into HepG2 cells. A mutation in the HNF1-binding site at +3 (C→T) or -2 (T→C) reduced the reporter gene activity by 90 and 85%, respectively, while a mutation in the HNF3 site at -20 (A→G) reduced the activity by 82%. Finally, mutations in the PCE1 site at +23 and +24 (GG→AA) reduced the activity by 84% compared with the wild-type promoter. These results demonstrate that specific mutations in these protein-binding sites for HNF1, HNF3, and PCE1 greatly impair the promoter for the protein C gene.

Characterization of the Promoter by EMSA

Synthetic oligonucleotides used in EMSAs (Table 1) were designed for the regulatory sites in the proximal promoter region for the normal protein C gene as well as for known mutations in genes of patients with coagulation disorders. With the HepG2 nuclear extract, all three wild-type double-stranded oligonucleotides (PCP1, PCP2, and PCE1; shown in Table 1) designed from footprinted areas bound nuclear proteins, which were visualized as bands with retarded mobility (lanes 1 in Fig. 4, panels A-C). These DNA-protein complexes were competed and eliminated by a 20- or 200-fold molar excess of the corresponding unlabeled binding site oligonucleotide (lanes 2 and 3 in Fig. 4, panels A-C), indicating that the interactions of nuclear proteins with these binding sites are sequence-specific. When these sequences were tested with HeLa or fibroblast nuclear extract, little if any protein binding occurred (data not shown). These results, together with the footprint and CAT reporter functional assays, indicate that proteins bound to these regulatory sequences are liver-specific transcription factors.


Figure 4: EMSAs of protein binding at the protein C promoter region. End-labeled duplex oligonucleotides were each incubated with crude HepG2 nuclear extracts and analyzed by EMSAs. For competition, unlabeled duplex oligonucleotides in 20- or 200-fold molar excesses over labeled oligonucleotides were added to the reaction mixture 10 min before adding the labeled probe. F indicates the position of free oligonucleotide, and B indicates the position of the retarded DNA-protein complexes. A, EMSA of the 32P-labeled PC(HNF1) oligonucleotide (lane 1) and competition with 20- and 200-fold molar excesses of unlabeled PC(HNF1) (lanes 2 and 3), PC(HNF1,m1) (lanes 4 and 5), PC(HNF1,m2) (lanes 6 and 7), and HNF1 (lanes 8 and 9) oligonucleotides; EMSAs of the 32P-labeled PC(HNF1,m1) (lane 10) and PC(HNF1,m2) (lane 11) oligonucleotides; and EMSA of the 32P-labeled HNF1 oligonucleotide (lane 12) and competition with a 200-fold excess of unlabeled HNF1 (lane 13) and PC(HNF1) (lane 14) oligonucleotides. B, EMSA of the 32P-labeled PC(HNF3) oligonucleotide (lane 1) and competition with 20- and 200-fold excesses of unlabeled PC(HNF3) (lanes 2 and 3), PC(HNF3,m1) (lanes 4 and 5), and HNF3 (lanes 6 and 7) oligonucleotides and EMSA of the 32P-labeled PC(HNF3,m1) oligonucleotide (lane 8). C, EMSA of the PCE1 oligonucleotide (lane 1) and competition with 20- and 200-fold excesses of oligonucleotides designed from binding sites for different liver-specific transcription factors: PCE1 (lanes 2 and 3), HNF1 (lanes 4 and 5), HNF3 (lanes 6 and 7), HNF4 (lanes 8 and 9), HNF5 (lanes 10 and 11), and C/EBP (lanes 12 and 13). D, EMSAs of the 32P-labeled PCE1 oligonucleotide (lane 1) and competition with 20- and 200-fold excesses of unlabeled PCE1 (lanes 2 and 3) and PCE1,m1 (lanes 4 and 5) oligonucleotides and EMSA of the 32P-labeled PCE1,m1 oligonucleotide (lane 6). E, EMSA of the 32P-labeled PC(Sp1) oligonucleotide (lane 1) and competition with 20- and 200-fold excesses of unlabeled PC(Sp1) (lanes 2 and 3), PC(Sp1,m1) (lanes 4 and 5), and Sp1 (lanes 6 and 7) oligonucleotides and EMSA of the 32P-labeled PC(Sp1,m1) oligonucleotide (lane 8).



As shown in Fig. 4A, DNA-protein complex formation by the PC(HNF1) oligonucleotide and HepG2 nuclear proteins was not influenced by the addition of mutated oligonucleotides (PC(HNF1,m1) and PC(HNF1,m2)) (lanes 4-7), but was competed and abolished by 20- and 200-fold molar excesses of unlabeled HNF1 consensus oligonucleotide (lanes 8 and 9). Furthermore, 32P-labeled PC(HNF1) sequences that were mutated (PC(HNF1,m1) and PC(HNF1,m2)) were also unable to bind hepatic nuclear proteins (Fig. 4A, lanes 10 and 11). Finally, the retarded bands formed by the oligonucleotide containing the HNF1 consensus sequence and HepG2 nuclear proteins were competed and eliminated completely by a 200-fold molar excess of HNF1 and PC(HNF1) oligonucleotides (Fig. 4A, lanes 12-14). These results clearly indicate that a nuclear protein(s) binds to an HNF1 site in the promoter of protein C and that a single base mutation at +3 (C→T) or -2 (T→C) abolishes this binding.

The PC(HNF3) oligonucleotide was also bound to HepG2 nuclear protein, but with low affinity. However, this DNA-protein complex was competed and eliminated by a 20- or 200-fold molar excess of unlabeled PC(HNF3) oligonucleotide (Fig. 4B, lanes 2 and 3). Unlabeled mutated PCP2 oligonucleotide (PC(HNF3,m1)) was unable to compete and abolish DNA-protein complex formation (Fig. 4B, lanes 4 and 5). Also, the 32P-labeled PC(HNF3,m1) oligonucleotide was unable to bind hepatic nuclear proteins (Fig. 4B, lane 8). These results demonstrate that this site is an HNF3-binding site and that a single base mutation at -20 (A→G) can abolish its binding to hepatic proteins.

As shown in Fig. 4C, two retarded bands were formed when the PCE1 oligonucleotide designed from the FP-I region was incubated with HepG2 nuclear extract. Oligonucleotides designed from consensus sequences for the most abundant known hepatic transcription factors, including HNF1 (23), HNF3 and HNF4 (28), HNF5 (29), and C/EBP (30), were unable to compete and eliminate the retarded complexes formed by the PCE1 oligonucleotide (Fig. 4C, lanes 4-13), indicating that this element is a unique and specific sequence. Also, the unlabeled mutated sequence (PCE1,m1) was unable to compete and eliminate the DNA-protein complexes (Fig. 4D, lanes 4 and 5). The 32P-labeled PCE1,m1 oligonucleotide also failed to bind hepatic nuclear proteins (Fig. 4D, lane 6). These two base mutations were located at +23 and +24 (GG→AA).

Characterization of a Potential Sp1 Site

Sequences spanning from +58 to +65 (CCCGCCCC) contain an Sp1 consensus sequence (31) that has been examined by reporter gene assays. The pPC-1528 construct was mutated in the Sp1-binding site at +63 (C→T), transfected into HepG2 cells, and compared with pPC-1528. The single base substitution employed in these experiments occurs in the proximal Sp1 site of the human low density lipoprotein receptor promoter, resulting in a heterozygous familial hypercholesterolemia (32). The mutation in the protein C promoter reduced the activity of pPC-1528 by only 20%. Cotransfection of pPC-42-66 containing the Sp1-binding site with an Sp1 expression plasmid (33) resulted in a 1.6-fold increase in activity, while no effect was observed with the pPC-42-34 construct lacking the Sp1 site. The small increase may be due in part to the relatively high level of Sp1 already present in HepG2 cells. The potential Sp1 site was further examined by EMSAs. An oligonucleotide from this region (+53 to +70; PC(Sp1)) bound to HepG2 nuclear proteins and formed retarded bands (Fig. 4E, lane 1). These bands were competed and abolished by a 20- or 200-fold excess of unlabeled PC(Sp1) or Sp1 consensus oligonucleotide (Fig. 4E, lanes 2, 3, 6, and 7). Unlabeled mutated oligonucleotide (PC(Sp1,m1)) was unable to compete and abolish DNA-protein complex formation (Fig. 4E, lanes 4 and 5). Also, the 32P-labeled PC(Sp1,m1) oligonucleotide was unable to bind hepatic nuclear proteins (Fig. 4E, lane 8). These data are consistent with the proposal that the nucleotide sequence in the region from +53 to +70 may contain an active Sp1-binding site and that a single base mutation at +63 (C→T) inhibits its binding to hepatic proteins.
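The position of the substitution within the Sp1 consensus can be made explicit with a short Python sketch. This is an illustration only; the helper name is hypothetical, and it assumes that the quoted sequence CCCGCCCC is written 5` to 3` with its first base at +58 and that the span does not cross the -1/+1 boundary.

```python
def apply_point_mutation(seq: str, first_position: int, position: int,
                         ref: str, alt: str) -> str:
    """Return `seq` with the base at `position` changed from `ref` to `alt`.

    Assumptions: `seq` is written 5'->3', its first base lies at
    `first_position`, and all positions share the same sign (no position 0
    falls inside the span).
    """
    index = position - first_position
    if not 0 <= index < len(seq):
        raise ValueError("position lies outside the sequence")
    if seq[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {seq[index]}")
    return seq[:index] + alt + seq[index + 1:]


SP1_SITE = "CCCGCCCC"  # +58 to +65, as quoted in the text
mutant = apply_point_mutation(SP1_SITE, first_position=58, position=63,
                              ref="C", alt="T")
print(mutant)  # CCCGCTCC
```

Under these assumptions the +63 (C→T) substitution changes the sixth base of the eight-base consensus.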


DISCUSSION

This study has demonstrated that 1.5 kb of DNA from the 5`-flanking region and 66 bp from the noncoding exon I sequences of the protein C gene contain sufficient information for high level expression of the gene in HepG2 cells. The data also indicate that the protein C promoter consists of at least three liver-specific regulatory elements and one general element that drives the high level, liver-specific expression of the gene. These elements include HNF1, HNF3, and PCE1 as well as a potential Sp1-binding site, all of which are located in the region surrounding the transcription initiation site.

HNF1alpha, a homeodomain transcription factor, has been reported to be a major transactivator of numerous liver-specific genes and is also an activator of the protein C promoter(27) . Cotransfection with HNF1alpha induced a 1.5-fold transactivation in the wild-type promoter and a 0.8-fold transactivation in a mutated promoter. These data are consistent with the present experiments showing that the HNF1-binding site is important for basal level transcription of the gene. Whether other factors of the HNF1 family participate in the transactivation of the protein C gene is not yet clear.

The binding affinity of hepatic nuclear protein(s) for the two HNF3 sites in the protein C promoter was quite low. Recently, it has been shown that cotransfection experiments with an HNF3 expression plasmid and the wild-type protein C promoter resulted in a 4-5-fold increase in promoter activity in HepG2 cells(34) . HNF3-binding sites have been identified as essential cis-acting elements in the promoters and enhancers of several liver-specific genes. However, the transactivation by HNF3 of an HNF3-dependent minimal promoter was relatively low since it did not exceed 4-5-fold. In contrast, HNF1-dependent promoters show a >100-fold increase when excess HNF1 is present. Several laboratories have reported that an important role of HNF3 could be to cooperate with other factors bound to contiguous DNA elements, such as the glucocorticoid-responsive enhancer(29) , the nuclear factor 1 element (24) , the HNF4/ARP1/COUP-TF family-binding site(35) , and the HNF1 element(36) . The low binding of hepatic protein to the HNF3 site in the protein C gene may also be due to the absence of accessory proteins or sequences. Another proposed role for HNF3 involves the transition of chromatin from an inactive to an active conformation(37) . In the case of the gene for protein C, binding of HNF3 to the PC(HNF3) sequence may contribute to opening the chromatin at or near the protein C promoter, therefore making it available for subsequent HNF1 binding and transcriptional activation.

Deletion analysis from the 3`-end revealed that the PCE1 site, a unique and liver-specific regulatory sequence, was the principal element for high efficiency transcription. Mutations at +23 and +24 (GG AA) in this element abolished its binding to hepatic proteins and greatly decreased its transcriptional activity. Mutational analysis, cotransfection experiments, and EMSAs also showed a potential Sp1-binding site in exon I downstream from the PCE1 element. Sp1-binding sites have been demonstrated as an important regulatory element at transcription initiation sites for many TATA-less promoters, including the gene for factor VII(38) . It is believed that the preinitiation complex is assembled around the multiple initiation sites directed by the tightly clustered regulatory elements in the proximal promoter region of the protein C gene. Any disruption in the promoter sequences surrounding the transcription initiation sites impairs the assembly of the preinitiation complex, causing a reduction in transcription efficiency. Furthermore, this promoter region is similar to the 80-bp enhancer region described for the prothrombin gene in which the HNF1-binding site is flanked on the 3`-side by Sp1 sequences (Fig. 5).


Figure 5: Schematic comparison of the known transcription regulatory sites present in the genes coding for the human vitamin K-dependent coagulation proteins. Red inverted triangles correspond to silencer or repressor elements. Upright triangles correspond to positive regulatory elements. Black triangles indicate elements with no known homologous sequences. Other elements are labeled according to their corresponding transcription factors. FVII, factor VII; FIX, factor IX; FX, factor X; PC, protein C; NF-1, nuclear factor 1.



cis-Acting elements located upstream from the promoter can also modulate the promoter activity. The upstream -162/-82 fragment decreases the activities of the strongly active pPC-82-66. This reduction, observed in HepG2 cells as well as in HeLa cells, may be due to a silencer element interacting with ubiquitous factors or to other effects, such as steric hindrances exerted on promoter elements. A possible HNF4-like element from -131 to -116 is also located in this region. Polymorphism in this region has been described to affect plasma protein C levels in the population(39) . Further work is needed to elucidate the role of negative regulation in protein C gene expression.

One particular feature of the protein C gene among the vitamin K-dependent genes is that it contains an additional short noncoding exon I sequence upstream from the translation start codon (AUG), separated by a 1463-bp intron sequence. The gene coding for the only other vitamin K-dependent anticoagulant factor, protein S, has also been postulated to contain an additional noncoding exon I sequence since two transcripts have been observed(40) . The participation of intron sequences in regulating protein C gene expression is currently under investigation.

It is common in many TATA-less promoters that transcription initiates from a cluster of sites surrounding +1. Each of the initiation sites found in the protein C gene was surrounded by pyrimidine-rich sequences, as is characteristic of most initiation sites of other genes. The +1 and -7 sites were present in the HNF1-binding site, whereas the +12 site was located in the PCE1 site (Fig. 3C). Several promoters, generally but not necessarily lacking the TATA box, have an initiator element that can replace or reinforce the role of the TATA sequence in directing the location of a transcription start site. These initiator elements have recently been grouped into families based upon sequence homology (41). Sequences surrounding the multiple initiation sites of the protein C gene, however, were not homologous to any other known initiator sequence(s). It is unclear at this point whether an unidentified initiator element in the protein C gene or the clustered regulatory elements initiate the assembly of the transcription apparatus. This situation is very similar to that of the factor IX gene, where the promoter is characterized by a tight cluster of regulatory elements surrounding the transcription initiation site (42). Mutations found to date in the 5'-flanking region of the factor IX gene in patients with factor IX deficiency (hemophilia B Leyden phenotype) were all located in this tight cluster, called the Leyden-specific region, from -40 to +20 in the 5'-end sequence. This is also comparable to the protein C gene, in which naturally occurring mutations in the 5'-region were located from -20 to +3. Characterization of the protein C promoter helps explain genetic disorders caused by known, and possibly additional, mutations occurring in the 5'-region in patients with type I protein C deficiency.
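The pyrimidine-rich context of the initiation sites described above can be checked computationally. The following is a minimal Python sketch, not part of the original analysis; the function name, the window size, and the example sequence are hypothetical and chosen only for illustration. It computes the fraction of C and T residues in a window centered on a candidate start site.

```python
def pyrimidine_fraction(seq: str, center: int, flank: int = 5) -> float:
    """Return the fraction of pyrimidines (C or T) in a window around `center`.

    `center` is a 0-based index into `seq`; the window spans up to `flank`
    bases on each side and is clipped at the ends of the sequence.
    All names and defaults here are illustrative assumptions.
    """
    if not 0 <= center < len(seq):
        raise ValueError("center must be a valid index into seq")
    start = max(0, center - flank)
    window = seq[start:center + flank + 1].upper()
    pyrimidines = sum(base in "CT" for base in window)
    return pyrimidines / len(window)


if __name__ == "__main__":
    # Hypothetical sequence, not the protein C promoter.
    example = "AAACTCTTCAGG"
    print(pyrimidine_fraction(example, center=5, flank=2))
```

A value close to 1 indicates a pyrimidine-rich window; the threshold used to call a site "pyrimidine-rich" would have to be chosen by the user.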

In addition to protein C, there are six other vitamin K-dependent glycoproteins that circulate in blood, including factors VII, IX, and X, prothrombin, protein S, and protein Z. The genes for these vitamin K-dependent proteins share significant organizational similarity and have evolved from a common ancestral gene (43). It is also noted that five of these genes (not including the protein S and protein Z genes, the regulation of which has not been studied) are regulated by "TATA"-less promoters. Furthermore, transcriptional regulation of these genes shares certain common features (Fig. 5). The factor VII gene, which is located on chromosome 13q34-qter (44), is regulated by two promoter elements, the FVIIP1 site containing an HNF4-binding site and the FVIIP2-binding site present in a GC-rich sequence that binds hepatic-specific factors as well as the ubiquitous transcription factor Sp1 (38). In addition, two silencer elements were located upstream of the promoter region. The factor IX gene, which is located on chromosome Xq26-27, is regulated by the presence of liver-specific cis-acting elements that interact with the liver-enriched transcription factors C/EBP and HNF4 (45, 46) and with the liver-specific transcription factors nuclear factor 1-like liver-specific protein and D-site-binding protein (DBP) (42). The factor IX gene may be hormonally regulated since the deficiency in hemophilia B Leyden, which is caused by mutations in the Leyden-specific regulatory region of the factor IX gene, can be partially overcome following puberty or by the administration of testosterone (47). In rat liver, DBP, which recognizes some of the same cis-elements as C/EBP, was not expressed until puberty (48).
Reporter gene studies and DNA-protein binding assays with factor IX promoter sequences containing hemophilia B Leyden mutations of the C/EBP-binding site suggest that DBP may enhance C/EBP binding and transcription of the factor IX gene due to a synergistic interaction between C/EBP and DBP (49). Hence, the hormonal regulation of the factor IX gene is probably due to the induction of DBP expression during puberty rather than the presence of an androgen-responsive element in the factor IX gene. The factor X gene is located 2.8 kb downstream from the factor VII gene on chromosome 13q34-qter. The factor X gene is regulated by three positive regulatory regions (FXP1, FXP2, and FXP3 sites) and a negative element that blocks the transcriptional activity toward the upstream factor VII gene (17, 50). Transfection in HepG2 and human fibroblast cells suggests that the FXP1 and FXP3 sites interact with liver-specific trans-activating factors, while the FXP2 site interacts with ubiquitous transcription factors. Furthermore, the FXP1 site contains a 22-bp sequence similar to the consensus recognition site for the liver-specific transcription factor HNF4. This HNF4-binding element in the FXP1 site has a 6-bp core sequence (CTTTGC) that is also present in the HNF4-binding elements of the factor VII and factor IX promoters. The prothrombin gene is located on chromosome 11p11-q12 (51). It contains a weak promoter immediately before the transcription initiation site and a liver-specific enhancer sequence located 860-940 nucleotides from the transcription initiation site. The latter region apparently interacts with HNF1 and is flanked on the 3'-side by a GC-rich sequence that is similar to an Sp1-binding site and is essential for enhancer activity (52, 53, 54). The 10-base pair GC-rich sequence shares 90% sequence identity with the Sp1-binding site present in the factor VII promoter. As shown in Fig. 5, there are a number of common regulatory units shared by the vitamin K-dependent proteins as well as unique sequences that regulate the individual proteins.
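Two of the sequence comparisons mentioned above, the shared 6-bp HNF4 core (CTTTGC) and the 90% identity between two 10-bp GC-rich sequences (9 of 10 positions matching), can be illustrated with a minimal Python sketch. This is not the method used in the study; the function names and the example sequences are hypothetical, and only the core motif CTTTGC is taken from the text.

```python
# Illustrative sketch; function names and example sequences are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "CTTTGC") -> list[tuple[int, str]]:
    """List (0-based position, strand) for every occurrence of `motif`.

    "+" marks a match on the given strand; "-" marks a match to the
    reverse complement, i.e. the motif on the opposite strand.
    """
    seq, motif = seq.upper(), motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc:
            hits.append((i, "-"))
    return hits


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Hypothetical sequences, not taken from the promoters discussed here.
    print(find_motif("AACTTTGCAA"))
    print(percent_identity("GGGGCGGGGC", "GGGGCGGGGT"))
```

Scanning both strands matters because a binding element may be present in either orientation, as with the oppositely oriented HNF3-binding sites in the protein C promoter.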


FOOTNOTES

*
This work was supported by Research Grant HL-16919 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) U47685.

§
To whom correspondence should be addressed: Dept. of Biochemistry, University of Washington, P. O. Box 357350, Seattle, WA 98195-7350.

(^1)
The abbreviations used are: kb, kilobase(s); HNF, hepatic nuclear factor; CAT, chloramphenicol acetyltransferase; PCR, polymerase chain reaction; bp, base pair(s); EMSA, electrophoretic mobility shift assay; C/EBP, CCAAT/enhancer-binding protein; DBP, D-site-binding protein.


ACKNOWLEDGEMENTS

We thank Drs. Donald C. Foster and Dominic W. Chung for kindly providing the recombinant phage clone (pC6) and Dr. Robert Tjian for an Sp1 expression plasmid (pAdh-Sp1-in). We also thank Jeff Harris for preparing the oligonucleotides employed in this study and L. Boba for help in preparing the manuscript.


REFERENCES

  1. Stenflo, J. (1976) J. Biol. Chem. 251, 355-363
  2. Kisiel, W., Ericsson, L. H. & Davie, E. W. (1976) Biochemistry 15, 4893-4900
  3. Kisiel, W., Canfield, W. M., Ericsson, L. H. & Davie, E. W. (1977) Biochemistry 16, 5824-5831
  4. DiScipio, R. G. & Davie, E. W. (1979) Biochemistry 18, 899-904
  5. Fernlund, P. & Stenflo, J. (1983) J. Biol. Chem. 258, 12509-12512
  6. McMullen, B. A., Fujikawa, K., Kisiel, W., Sasagawa, T., Howald, W. N., Kwa, E. Y. & Weinstein, B. (1983) Biochemistry 22, 2875-2884
  7. Jackson, R. C. & Blobel, G. (1980) Ann. N. Y. Acad. Sci. 343, 391-403
  8. Esmon, C. T. & Owen, W. G. (1981) Proc. Natl. Acad. Sci. U. S. A. 78, 2249-2252
  9. Griffin, J. H., Evatt, B., Zimmerman, T. S., Kleiss, A. J. & Wideman, C. (1981) J. Clin. Invest. 68, 1370-1373
  10. Griffin, J. H., Mosher, D. F., Zimmerman, T. S. & Kleiss, A. J. (1982) Blood 60, 261-264
  11. Dahlback, B. (1994) J. Clin. Invest. 94, 923-927
  12. Brenner, B., Zivelin, A., Lanir, N., Greengard, J., Griffin, J. H. & Seligsohn, U. (1995) Thromb. Haemostasis 73, 942 (abstr.)
  13. Mandel, H., Brenner, B., Berant, M., Rosenberg, N., Lanir, N., Jakobs, C., Fowler, B. & Seligsohn, U. (1995) Thromb. Haemostasis 73, 1361 (abstr.)
  14. Foster, D. C., Yoshitake, S. & Davie, E. W. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 4673-4677
  15. Plutzsky, J., Hoskins, J. A., Long, G. L. & Crabtree, G. R. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 546-550
  16. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
  17. Miao, C. H., Leytus, S. P., Chung, D. W. & Davie, E. W. (1992) J. Biol. Chem. 267, 7395-7401
  18. Nelson, R. M. & Long, G. L. (1989) Anal. Biochem. 180, 147-151
  19. Graham, F. T. & van der Eb, A. T. (1973) Virology 52, 456-467
  20. Gorman, C. M., Moffat, L. F. & Howard, B. H. (1982) Mol. Cell. Biol. 2, 1044-1051
  21. Gorman, C. M. (1985) in DNA Cloning (Glover, D. M., ed) Vol. II, pp. 143-190, IRL Press, Washington, D. C.
  22. Herbomel, P., Bourachot, B. & Yaniv, M. (1984) Cell 39, 653-662
  23. Courtois, G., Morgan, J. G., Campbell, L. A., Fourel, G. & Crabtree, G. R. (1987) Science 238, 688-692
  24. Jackson, D. A., Rowader, K. E., Stevens, K., Jiang, C., Milos, P. & Zaret, K. S. (1993) Mol. Cell. Biol. 13, 2401-2410
  25. Reitsma, P. H., Poort, S. R., Bernardi, F., Gandrille, S., Long, G. I., Sala, N. & Cooper, D. N. (1993) Thromb. Haemostasis 69, 77-84
  26. Tsay, W., Greengard, J. S., Montgomery, R. R., Mcpherson, R. A., Fucci, J. C., Koerper, M. A., Coughlin, J. & Griffin, J. H. (1993) Blood Coagul. & Fibrinolysis 4, 791-796
  27. Berg, L.-P., Scopes, D. A., Alhag, A., Kakkar, V. V. & Cooper, D. N. (1994) Hum. Mol. Genet. 3, 2147-2152
  28. Costa, R. H., Grayson, D. R. & Darnell, J. E., Jr. (1989) Mol. Cell. Biol. 9, 1415-1425
  29. Grange, T., Roux, J., Rigaud, G. & Pictet, R. (1991) Nucleic Acids Res. 19, 131-139
  30. Tronche, F., Rollier, A., Sourdive, D., Cereghini, S. & Yaniv, M. (1991) J. Mol. Biol. 222, 21-43
  31. Jackson, S. P., MacDonald, J. J., Lees-Miller, S. & Tjian, R. (1990) Cell 63, 155-165
  32. Koivisto, U.-M., Palvimo, J. J., Janne, O. A. & Kontula, K. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 10526-10530
  33. Courey, A. J. & Tjian, R. (1988) Cell 55, 887-898
  34. Spek, C. A., Greengard, J. S., Griffin, J. H., Bertina, R. M. & Reitsma, P. H. (1995) J. Biol. Chem. 270, 24216-24221
  35. Paulweber, B., Sandhofer, F. & Levy-Wilson, B. (1993) Mol. Cell. Biol. 13, 1534-1546
  36. Gregori, C., Kahn, A. & Picard, A. L. (1994) Nucleic Acids Res. 22, 1242-1246
  37. Mcpherson, C. L., Shim, E.-Y., Friedman, D. S. & Zaret, K. S. (1993) Cell 75, 387-398
  38. Greenberg, D., Miao, C. H., Ho, W.-T., Chung, D. W. & Davie, E. W. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 12347-12351
  39. Spek, C. A., Koster, T., Rosendaal, F. R., Bertina, R. M. & Reitsma, P. H. (1995) Arterioscler. Thromb. Vasc. Biol. 15, 214-218
  40. van Amstel, H. K. P., Reitsma, P. H., van der Logt, C. P. E. & Bertina, R. M. (1990) Biochemistry 29, 7853-7861
  41. Weis, L. & Reinberg, D. (1992) FASEB J. 6, 3300-3309
  42. Kurachi, S., Furukawa, M., Salier, J.-P., Wu, C.-T., Wilson, E. J., French, F. S. & Kurachi, K. (1994) Biochemistry 33, 1580-1591
  43. Leytus, S. P., Foster, D. C., Kurachi, K. & Davie, E. W. (1986) Biochemistry 25, 5098-5102
  44. O'Hara, P. J., Grant, F. J., Haldeman, B. A., Gray, C. L., Insley, M. Y., Hagan, F. S. & Murray, M. J. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 5158-5162
  45. Crossley, M., Ludwig, M., Stowell, K. M., De Vos, P., Olek, K. & Brownlee, G. G. (1992) Science 257, 377-379
  46. Reijnen, M.-J., Sladek, F. M., Bertina, R. M. & Reitsma, P. H. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 6300-6303
  47. Briet, E., Wijnands, M. C. & Veltkamp, J. J. (1985) Ann. Intern. Med. 103, 225-226
  48. Mueller, C. R., Maire, P. & Schibler, U. (1990) Cell 61, 279-291
  49. Picketts, D. J., Lillicrap, D. P. & Mueller, C. R. (1993) Nat. Genet. 3, 175-179
  50. Huang, M. N., Hung, H. L., Stanfield-Oakley, S. A. & High, K. A. (1992) J. Biol. Chem. 267, 15440-15446
  51. Royle, N. J., Irwin, D. M., Kochinsky, M. L., MacGillivray, R. T. A. & Hamerton, J. L. (1987) Somatic Cell Mol. Genet. 13, 285-292
  52. Chow, B. K., Ting, V., Tufaro, F. & MacGillivray, R. T. (1991) J. Biol. Chem. 266, 18927-18933
  53. Bancroft, J. D., Schaefer, L. A. & Degen, S. J. F. (1990) Gene (Amst.) 95, 253-260
  54. MacGillivray, R. T. A. & Chow, B. K. C. (1991) Thromb. Haemostasis 65, 612 (abstr.)
  55. Maxam, A. M. & Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
