(Received for publication, November 22, 1995; and in revised form, January 23, 1996)
From the
Δ¹-Pyrroline-5-carboxylate dehydrogenase (P5CDh;
EC 1.5.1.12), a mitochondrial matrix NAD+-dependent
dehydrogenase, catalyzes the second step of the proline degradation
pathway. Deficiency of this enzyme is associated with type II
hyperprolinemia (HPII), an autosomal recessive disorder characterized
by accumulation of Δ¹-pyrroline-5-carboxylate (P5C) and
proline. As an initial step in understanding the biochemistry of human
P5CDh and molecular basis of HPII, we utilized published peptide
sequence data and degenerate primer polymerase chain reaction to clone
two full-length human P5CDh cDNAs, differing in length by 1 kilobase
pair (kb). Both cDNAs have the identical 1689-base pair open reading
frame encoding a protein of 563 residues with a predicted molecular
mass of 62 kDa. The long cDNA contains an additional 1-kb insert in the
3'-untranslated region that appears to be an alternatively spliced
intron. The conceptual translation of human P5CDh has 89% sequence
identity with the published human P5CDh peptide sequences and 42 and
26% identity with Saccharomyces cerevisiae and Escherichia
coli P5CDhs, respectively, as well as homology to several other
aldehyde dehydrogenases. Both P5CDh cDNA clones detect a single 3.2-kb
transcript on Northern blots of multiple human tissues, indicating the
long cDNA containing the 3'-untranslated intron represents the
predominant transcript. The P5CDh structural gene appears to be single
copy with a size of about 20 kb localized to chromosome 1. To confirm
the identity of the putative P5CDh cDNAs, we expressed them in a
P5CDh-deficient strain of S. cerevisiae. Both conferred
measurable P5CDh activity and the ability to grow on proline as a sole
nitrogen source.
P5C dehydrogenase (EC 1.5.1.12) is a mitochondrial
matrix NAD+-dependent enzyme catalyzing the
irreversible conversion of P5C, derived either from proline or
ornithine, to glutamate. This reaction is a necessary step in the
pathway interconnecting the urea and tricarboxylic acid cycles (1). Human P5CDh is classified as a member of the aldehyde
dehydrogenase (ALDh) superfamily (2) on the basis of substrate
specificity and kinetic properties (3, 4, 5).
The preferred substrate of P5CDh is glutamic γ-semialdehyde, which
is in spontaneous nonenzymatic equilibrium with P5C. Other substrates
include succinic, glutaric, and adipic semialdehydes (3). Human
P5CDh is a homodimer with a molecular mass of 142-175 kDa (3).
The P5CDh genes of Escherichia coli (6), Salmonella typhimurium (7), and Saccharomyces cerevisiae (8, 9) have been cloned and sequenced. The E. coli putA gene is bifunctional, encoding both P5CDh and proline oxidase. The S. cerevisiae PUT2 gene encodes P5CDh only. Mutant strains of S. cerevisiae lacking P5CDh activity (8) are unable to use proline as a sole nitrogen source and provide a system for expression and analysis of any putative human P5CDh. No molecular information regarding mammalian, or specifically human, P5CDh is available. In humans, deficiency of P5CDh causes HPII, an autosomal recessive inborn error of metabolism (10, 11) characterized by a 10-15-fold elevation of plasma proline (normal range 100-350 µM) and a 10-40-fold elevation of plasma P5C (normal range 0.2-2 µM) (1, 10, 11). Although some adults with HPII appear to be normal, this disorder may be causally related to neurologic manifestations, including seizures and mental retardation (1, 12).
Utilizing published human P5CDh peptide sequences and degenerate primer PCR, we cloned two full-length P5CDh cDNAs of 2139 and 3150 bp. Both clones encode an identical 563-amino acid protein; the longer contains an additional 1-kb insert in the 3'-untranslated region. Functional complementation of a S. cerevisiae put2 mutant confirms the identity of these cDNAs as human P5CDh cDNAs.
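The sizes quoted for the two clones can be cross-checked with simple arithmetic. The sketch below is only a consistency check; it assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from this work, and the function name is illustrative.

```python
# Consistency check of the clone sizes quoted in the text.
# ASSUMPTION: ~110 Da average mass per amino acid residue (rule of thumb,
# not a figure from the paper).
AVG_RESIDUE_MASS_DA = 110


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa estimated from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000


orf_bp = 1689                 # open reading frame length from the text
residues = orf_bp // 3        # number of codons in the open reading frame
insert_bp = 3150 - 2139       # length difference between the two cDNAs

print("codons in ORF:", residues)
print("extra sequence in long cDNA (bp):", insert_bp)
print("approximate mass (kDa):", round(approx_mass_kda(residues)))
```

Under these assumptions the codon count matches the 563 residues reported, the length difference matches the roughly 1-kb insert, and the estimated mass is close to the predicted 62 kDa.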
Figure 1: Comparison of the deduced amino acid sequence of human P5CDh to those of S. cerevisiae and E. coli P5CDhs. Residues identical in at least two out of three proteins are shown in black. The vertical arrowheads indicate potential sites of cleavage of the mitochondrial targeting sequence (see text). The overlines and Roman numerals indicate the published P5CDh peptide sequences(4) . The position of the degenerate PCR primers is indicated by the horizontal arrows. The brackets indicate the location of potential glycosylation sites.
To extend the 5'-end
of the P5CDh cDNA, we performed a 5' rapid amplification of the cDNA
end (RACE) with 2 µg of human HepG2 poly(A)+ RNA,
an antisense primer corresponding to the most 5'-region of the human
retinal P5CDh cDNA (5'-CAGGGCAGCCTCAATGGC-3', primer 3) and a Clontech
5'-AmpliFinder RACE kit. Hybridization, clone isolation, and sequencing
were performed as described for the human retinal P5CDh cDNA clone.
Human genomic DNA was isolated from human lymphocytes according to the method of Kunkel (18). Genomic DNA (10 µg) was digested with EcoRI, BamHI, or HindIII, separated on a 0.8% agarose gel, and transferred to GeneScreen plus membrane (DuPont NEN). DNA was cross-linked to the membrane using a UV-autocross linker (Stratagene). Hybridization and autoradiography were performed under the same conditions used for library screening. Chromosome mapping was done using a PstI-digested monochromosomal somatic cell hybrid mapping panel (Oncor) as described in the manufacturer's manual. All hybridizations were with human P5CDhS as the probe.
Figure 2: Diagrammatic representation of the composite full-length cDNAs encoding human P5CDh and the constructs used for functional complementation in S. cerevisiae. The open reading frame is indicated by the gray filled rectangle; the 3'-UTR insert by the diagonal-hatched rectangle; the 5'-UTR of human P5CDh cDNAs by the white rectangle; the S. cerevisiae PUT2 5'-UTR by the black rectangle; the S. cerevisiae phosphoglycerate kinase promoter by the wavy-lined rectangle; and S. cerevisiae PUT2 promoter by the checkered rectangle.
To replace the phosphoglycerate kinase promoter and 5'-UTR of the human P5CDh cDNAs with S. cerevisiae PUT2 5' sequence, we used a PCR-based cloning strategy to construct pHsP5CDhS2 and pHsP5CDhL2. Briefly, we made use of a NarI site 7 bp downstream of the predicted start ATG to delete the 5'-UTR of the human P5CDh cDNAs. A 3'-antisense primer, primer 8 (5'-CGCGGGCGCCGGCAGCAGCATAATTCCTGTGAATTTG), contains complementary sequences of human P5CDh cDNA at position +1 to +18 (including the NarI site at +10) and yeast PUT2 5'-UTR position -1 to -16. A 5'-primer, primer 9 (5'-CCCAAGCTTGATCCATTAAACTGGAAACAC), introduces a 5'-HindIII cloning site and corresponds to yeast PUT2 promoter sequence bp -435 to -414 relative to the translational start site. After a 30-cycle amplification, we gel-purified the 450-bp product fragment and substituted it for the HindIII-NarI fragment of the plasmids pHsP5CDhS1 and pHsP5CDhL1 (see Fig. 2). The nucleotide sequences at the ligation junctions in plasmids pHsP5CDhS2 and pHsP5CDhL2 were confirmed by sequencing. pKB13, a plasmid containing the S. cerevisiae PUT2 gene, was a gift from M. Brandriss (19). We subcloned the yeast PUT2 gene from pKB13 to pSM703 to form pScPUT2.
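The layout of the two primers can be inspected with a few lines of code. The following is a minimal sketch, not part of the published procedure: it reverse-complements primer 8 to view the sense strand and looks for the NarI (GGCGCC) and HindIII (AAGCTT) recognition sequences. The helper names are hypothetical, and treating the first ATG in the sense strand as the junction between yeast and human sequence is an assumption made for illustration.

```python
# Primer sequences as printed in the text (5' to 3').
PRIMER_8 = "CGCGGGCGCCGGCAGCAGCATAATTCCTGTGAATTTG"  # antisense primer
PRIMER_9 = "CCCAAGCTTGATCCATTAAACTGGAAACAC"         # sense primer

# Standard recognition sequences of the two restriction enzymes.
SITES = {"NarI": "GGCGCC", "HindIII": "AAGCTT"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_site(seq: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return seq.find(site)


sense_8 = reverse_complement(PRIMER_8)
# ASSUMPTION: the first ATG marks the yeast 5'-UTR / human ORF junction.
atg = sense_8.find("ATG")

print("primer 8, sense strand:", sense_8)
print("sequence upstream of ATG:", sense_8[:atg])
print("sequence from ATG onward:", sense_8[atg:])
print("NarI site index in sense strand:", find_site(sense_8, SITES["NarI"]))
print("HindIII site index in primer 9:", find_site(PRIMER_9, SITES["HindIII"]))
```

Read this way, primer 8 carries 16 nucleotides upstream of an ATG followed by sequence containing a NarI site, and primer 9 carries a HindIII site near its 5' end, in line with the description above.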
The open reading frame of these clones encodes a 563-amino
acid protein with a predicted molecular mass of 62 kDa. The overall
amino acid sequence of the putative human P5CDh has 42 and 26%
identity to those of S. cerevisiae and E. coli,
respectively (Fig. 1). The predicted active site residues
(Glu314 and Cys348) are completely conserved
among these three proteins as well as five other members of the ALDh
family (Fig. 3A) (2, 6). A region
corresponding to a possible NAD+-binding motif is also
highly conserved (Fig. 3B). There are three potential N-glycosylation sites (consensus = N-X-(S/T))
located at Asn
, Asn
, and Asn
of the unprocessed P5CDh protein (Fig. 1)(22, 23, 24) .
Figure 3:
Comparison of the amino acid sequence of
motifs in human P5CDh to that of P5CDh of S. cerevisiae and E. coli and five members of the ALDh family. A, the
sequence around the predicted active site residues E314 and C348; B, the sequence around the possible NAD binding motif. Residues are numbered to the left. Residues
identical in five or more of the sequences are shown in black. HsP5CDh, human P5CDh; HsALDh1, human ALDh type I; HsALDh2, human ALDh type II; HsALDh3, human ALDh type
III; RnALDh4, rat ALDh type IV; HsSSADh, human
succinate semialdehyde dehydrogenase; and RnSSADh, rat
succinate semialdehyde dehydrogenase. The asterisks indicate
completely conserved residues thought to be involved in the active site
and corresponding to E314 and C348 of HsP5CDh.
As expected
for a mitochondrial matrix protein, the N terminus of the putative
P5CDh has no acidic residues and several arginine, leucine, and serine
residues (25, 26). Hendrick et al. (25) suggested that a fair predictor of the cleavage site
is the sequence R-X-(φ)-X-X-(S)- where
the R is located at -10 relative to the cleavage site and
φ = a hydrophobic residue, usually F, V, L, or I. Additionally, an
R is often present at -2 relative to the cleavage site. With
these considerations, there are two possible mitochondrial leader
cleavage sites in the conceptual translation of P5CDh (indicated by the vertical arrowheads in Fig. 1). The more N-terminal
site has an R at -10, a W at -8, an A at -5, and an R
at -2; the more distal site has an R at -10, a K at
-8, an S at -5, and a K at -2. We favor the more
N-terminal site which would predict a leader sequence of 24 residues
containing 4 R, 6 L, and 1 S residues and would yield a processed P5CDh
of 539 amino acids.
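The positional rule described above (R at -10, a hydrophobic residue at -8, S at -5, and often R at -2) can be turned into a simple tally. The sketch below is an illustration only, not the method used in this work: the function names are hypothetical, position -1 is assumed to be the last residue of the leader, and the test peptide is an invented toy sequence rather than the human P5CDh N terminus.

```python
# Residues the text lists as the usual hydrophobic choices at -8.
HYDROPHOBIC = frozenset("FVLI")


def score_cleavage_site(seq: str, leader_len: int) -> int:
    """Count how many of the four positional features (0-4) are present
    for a leader of leader_len residues.

    ASSUMPTION: position -1 is the last residue of the leader, so
    position -10 corresponds to index leader_len - 10.
    """
    def residue(offset: int) -> str:
        i = leader_len + offset
        return seq[i] if 0 <= i < len(seq) else ""

    checks = (
        residue(-10) == "R",
        residue(-8) in HYDROPHOBIC,
        residue(-5) == "S",
        residue(-2) == "R",
    )
    return sum(checks)


def rank_sites(seq: str, min_len: int = 10, max_len: int = 40):
    """Return (score, leader_len) pairs, best score first."""
    upper = min(max_len, len(seq))
    scored = [(score_cleavage_site(seq, n), n) for n in range(min_len, upper + 1)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# HYPOTHETICAL toy peptide built to satisfy the rule for a 14-residue leader.
toy = "MAAARALAASAARAGTPEDK"
print(rank_sites(toy)[:3])
```

A tally of this kind only ranks candidates; as in the text, choosing between sites with similar scores still relies on other features of the leader, such as its length and composition.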
Figure 4:
A, Northern blot analysis of P5CDh
expression in various human tissues. The same blot was probed with
radiolabeled P5CDhS (top) or human β-actin (bottom). The position of size markers (kb) is
indicated. B, Southern blot analysis of human genomic DNA
digested with the indicated restriction enzymes and hybridized with
radiolabeled P5CDhS. The position of size marker (kb) is
indicated. C, monochromosomal hybrid somatic cell mapping
panel (Oncor) hybridized with radiolabeled P5CDhS. The asterisks denote the specific human fragments hybridizing to HsP5CDh; the arrow denotes the lane loaded with
somatic cell hybrid genomic DNA containing human chromosome 1 and
positive for P5CDh.
To determine whether the structure of the genomic sequence encoding P5CDh matches the 3'-end of the P5CDhL cDNA, we compared the products of PCR amplification of genomic DNA and cDNA using primers flanking and/or at the ends of the insert. With either genomic DNA or P5CDhL cDNA as template and primer pairs 4 + 5, 4 + 7, 6 + 5, and 6 + 7, we detected 1,242-, 881-, 776-, and 415-bp product fragments, respectively, in both lanes (Fig. 5, lanes G and L). We detected only a 233-bp product fragment when using P5CDhS cDNA as template and primer pair 4 + 5 (Fig. 5, lane S). These results indicate that the 3'-UTR sequence of P5CDhL is colinear with genomic DNA and that the 1011-bp sequence is spliced out in P5CDhS.
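The reported product sizes can be checked for internal consistency. The sketch below assumes, from the way the primers are paired, that primers 4 and 6 lie on one side of the insert and primers 5 and 7 on the other; on a single colinear template, swapping one primer should then change the product length by the same amount whichever partner is used. Variable names are illustrative.

```python
# Product sizes (bp) reported for genomic DNA and the long cDNA.
# Keys are (primer on one side, primer on the other side); the assignment of
# primers to sides is inferred from the pairings, not stated in the text.
products = {(4, 5): 1242, (4, 7): 881, (6, 5): 776, (6, 7): 415}
short_cdna_4_5 = 233  # product with the short cDNA and primers 4 + 5

# Size change when primer 4 is replaced by primer 6, for each partner.
shift_4_to_6 = {p: products[(4, p)] - products[(6, p)] for p in (5, 7)}
# Size change when primer 5 is replaced by primer 7, for each partner.
shift_5_to_7 = {p: products[(p, 5)] - products[(p, 7)] for p in (4, 6)}
# Sequence present in the long form but absent from the short form.
removed = products[(4, 5)] - short_cdna_4_5

print("shift when swapping 4 for 6:", shift_4_to_6)
print("shift when swapping 5 for 7:", shift_5_to_7)
print("difference between long and short 4 + 5 products (bp):", removed)
```

Both shifts come out the same for either partner, as expected for one colinear template, and the difference between the long and short 4 + 5 products is close to the 1011-bp insert size given in the text.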
Figure 5: Comparison of amplification products from the region surrounding the insert in the 3'-UTR of P5CDh using genomic DNA (G), or the long (L) or short (S) cDNAs as template. The diagram of the P5CDh cDNAs on the left shows the location of the various primers and predicted amplification product lengths. M, size markers.
Using a monochromosomal somatic cell hybrid mapping panel and radiolabeled P5CDhS as probe, we mapped the human P5CDh structural gene to human chromosome 1 (Fig. 4C).
Figure 6: Culture of S. cerevisiae strains HV1, HV2, HV3, and HV4 on MGA-ura or MGP-ura for 7 days at 30 °C. See "Experimental Procedures" for strain information.
To confirm that the complementing activity in strains HV2, HV3, and HV4 was due to the expression of a functional P5CDh, we assayed the activity of this enzyme in extracts of these cells. Substantial P5CDh activity (nmol of product/h/mg, mean (range)) was detectable in HV2, 4.4 (4.2-4.6); HV3, 4.2 (4.0-4.4); and HV4, 2.9 (2.6-3.5). No activity was detected in yeast transformed with vector alone (strain HV1). We conclude that the protein encoded by pHsP5CDhS2 or pHsP5CDhL2 is human P5CDh and that the human enzyme is able to function in the heterologous yeast system.
Using published peptide sequences and degenerate primer PCR, we isolated two types of cDNAs encoding human P5CDh. Both encode the identical 563-amino acid protein but differ in their 3'-UTR, with one containing a 1011-bp insert. The protein predicted by conceptual translation of these cDNAs is in excellent agreement with the P5CDh peptide sequences published by Hempel et al. (4); all the peptides are recognizable, with an overall identity of 89% (93/105 residues) and 7 of the 12 mismatched residues located at the C termini of three peptides. As expected for a protein found in the mitochondrial matrix, the N-terminal region of human P5CDh has features characteristic of a mitochondrial targeting sequence. The most likely cleavage site yields a leader of 24 amino acids and a mature P5CDh of 539 amino acids with a molecular mass of 59 kDa, which is in excellent agreement with published values for human and rat P5CDh (3, 5, 29). Expression studies in yeast mutants lacking P5CDh activity confirm that the encoded protein is human P5CDh.
Human P5CDh has significant sequence similarity to the
orthologous protein of lower eukaryotes and prokaryotes (26-42%
identity) as well as to other aldehyde dehydrogenases, confirming that
it is a member of the aldehyde dehydrogenase superfamily. Establishing the
functional significance of the highly conserved residues common to
these proteins, in particular the putative active site residues and
NAD+-binding motifs, will require additional studies
with site-directed mutagenesis and functional assays. The human P5CDh
structural gene is located on chromosome 1, is at least 20 kb in size, and
appears to be a single copy with a relatively simple structure, as
indicated by the low number of hybridizing fragments detected in
genomic Southern blot analysis.
Several lines of evidence indicate that the 1-kb insert in P5CDh is an alternatively spliced intron. First, the insert sequence has consensus sequences for donor, branch point, and acceptor splice sites at the 5'- and 3'-termini, respectively. Second, we found that human P5CDh cDNA is colinear with genomic DNA over the area containing the insert (Fig. 5). We found cDNAs encoding both the long and short forms in kidney and retinal cDNA libraries, and we were able to detect both the long and short forms by reverse transcription and PCR amplification of HepG2 cell RNA (data not shown). However, our Northern blot data show that the transcript containing the insert is the predominant P5CDh mRNA (>95%) in all tissues examined. These results suggest that either this 3'-UTR intron is not removed from most transcripts or that its presence enhances mRNA stability so that the long transcripts preferentially accumulate. The 5'-splice site (AG gtggcc . . .) is not ideal, particularly at the -4 to -6 positions, with a "consensus value" of 0.66, whereas the value for typical introns ranges from 0.7 to 1.0 with a mean of 0.83 (27, 30, 31). This suggests that this intron is a poor substrate for splicing. Nevertheless, additional studies of transcript splicing and stability will be required to discriminate between these two possibilities. Other human cDNAs with alternatively spliced introns in the 3'-UTR have been described (32).
Yeast systems have proven valuable for the
study of human genes, particularly those encoding enzymes (33).
For example, human ornithine aminotransferase (34), P5C
reductase (17), galactose-1-phosphate
uridyltransferase (35), and cystathionine synthase (36) have all been shown to complement the growth phenotype of
the relevant mutant strain. Our results show that
chimeric P5CDh cDNAs composed of the 5'-UTR of the yeast PUT2 gene and the human open reading frame of P5CDh clearly complement
the growth phenotype of put2 yeast and express P5CDh activity
in the range observed for human fibroblasts (11). The failure
to obtain this result with the complete human cDNAs may be explained by
differences in the 5'-UTR of the two species. The PUT2 transcript has a typical S. cerevisiae 5'-UTR with an
abundance of A nucleotides (47%), a scarcity of G nucleotides (11%), and
an A in the -3 position relative to the ATG codon (28).
By contrast, the 30-nucleotide 5'-UTR encoded by our human P5CDh cDNAs
has 23% A, 20% G, and a G at the -3 position. A similar benefit
of substituting a yeast 5'-UTR to obtain maximal expression of a human
cDNA has been observed by others (37).
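The base-composition comparison above is easy to reproduce for any 5'-UTR sequence. As a minimal sketch, the code below applies it to the only stretch of PUT2 5'-UTR that appears in this text, the 16 nucleotides (-16 to -1) carried by primer 8; this fragment is shorter than the full 5'-UTR, so its percentages are not expected to equal the 47% A and 11% G quoted for the whole region. The function names are hypothetical.

```python
from collections import Counter

# Antisense primer 8 as printed in "Experimental Procedures" (5' to 3').
PRIMER_8 = "CGCGGGCGCCGGCAGCAGCATAATTCCTGTGAATTTG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def utr_stats(utr: str):
    """Return percent composition per base and the base at position -3.

    The sequence is assumed to end immediately before the A of the ATG,
    so utr[-3] is position -3 relative to the start codon.
    """
    counts = Counter(utr)
    percent = {base: 100 * counts[base] / len(utr) for base in "ACGT"}
    return percent, utr[-3]


sense = reverse_complement(PRIMER_8)
# ASSUMPTION: everything upstream of the first ATG is yeast PUT2 5'-UTR.
yeast_utr = sense[:sense.find("ATG")]
percent, minus3 = utr_stats(yeast_utr)

print("fragment:", yeast_utr)
print("percent composition:", percent)
print("base at -3:", minus3)
```

Even this short fragment is rich in A, poor in G, and has an A at -3, which is the pattern described for the PUT2 5'-UTR.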
Biochemical studies suggest that P5CDh mutations are responsible for HPII, an inborn error of proline catabolism (1). Availability of the human P5CDh cDNAs described in this work will allow us to determine the molecular basis of this disorder. Furthermore, the ability to complement yeast strain MB1472 (put2), which lacks P5CDh activity, with the human P5CDh cDNA clones will provide an assay system to examine the functional consequences of mutations in human P5CDh.
The nucleotide sequences reported in this paper have been submitted to the GenBank(TM)/EMBL Data Bank with accession numbers U24266 and U24267.