(Received for publication, September 7, 1994; and in revised form, November 2, 1994)
Human alcohol dehydrogenase (ADH) consists of a family of five evolutionarily related classes of enzymes that collectively function in the metabolism of a wide variety of alcohols including ethanol and retinol. Class IV ADH has been found to be the most active as a retinol dehydrogenase and thus may participate in retinoic acid synthesis. The gene encoding class IV ADH (ADH7) has now been cloned and subjected to molecular examination. Southern blot analysis indicated that class IV ADH is encoded by a single unique gene and has no related pseudogenes. The class IV ADH gene is divided into nine exons, consistent with the highly conserved intron/exon structure of other mammalian ADH genes. The predicted amino acid sequence of the exon coding regions indicates that a protein of 373 amino acids, excluding the amino-terminal methionine, would be translated, sharing greater sequence identity with class I ADH (69%) than with classes II, III, or V (59-61%). Expression of class IV ADH mRNA was detected in human stomach but not liver. This correlates with previous protein studies, which have indicated that class IV ADH is the major stomach ADH but unlike other ADHs is absent from liver. Primer extension studies using human stomach RNA were performed to identify the transcription initiation site lying 100 base pairs upstream of the ATG translation start codon. Nucleotide sequence analysis of the promoter region indicated the absence of a TATA box sequence often located about 25 base pairs upstream of the start site as well as the absence of GC boxes, which are quite often seen in promoters lacking a TATA box. The class IV ADH promoter thus differs from the other ADH promoters, which contain either a TATA box (classes I and II) or GC-boxes (class III), suggesting a fundamentally different form of transcriptional regulation.
Vertebrate alcohol dehydrogenase (EC 1.1.1.1) (ADH) exists as a family of enzymes divided into several
classes. Five classes of ADH have been identified in humans including
class I ADH, the typical liver ADH, which efficiently oxidizes ethanol
to acetaldehyde and is responsible for most of the ethanol metabolism
(Jörnvall et al., 1993). The complete
amino acid sequences of all five classes of human ADH have revealed
sequence identities ranging from 57 to 69% (Satre et al.,
1994). The evolutionary progenitor of all vertebrate ADH classes (as
well as all other ADHs found in other organisms) has been found to be
class III ADH, which exists ubiquitously in all organisms analyzed
(Danielsson and Jörnvall, 1992; Danielsson et
al., 1994). Class III ADH does not function in ethanol metabolism
but instead acts as a glutathione-dependent formaldehyde dehydrogenase
needed to remove formaldehyde normally produced by some metabolic
reactions (Koivusalo et al., 1989). Gene duplications from an
ancestral class III ADH gene and subsequent mutational divergence have
evidently given rise to several ADHs with enzymatic functions other
than formaldehyde metabolism such as ethanol and retinol metabolism.
The ability of ADH to act as a retinol dehydrogenase (Boleda et al., 1993; Yang et al., 1994) implies that it may participate in the synthesis of retinoic acid, the active form of vitamin A involved in regulating epithelial cell differentiation (Connor, 1988; De Luca, 1991). Retinoic acid is derived from retinol via two oxidation steps with retinal as the intermediate (Connor and Smit, 1987; Kim et al., 1992). Of the various classes of human ADH examined for retinol dehydrogenase activity in vitro, class IV ADH has been found to have the highest efficiency, with less activity attributed to ADH classes I and II, and no activity for class III (Yang et al., 1994). The inefficiency of class IV ADH in ethanol oxidation (Moreno and Parés, 1991; Stone et al., 1993) further suggests it has evolved to perform a role in the metabolism of other alcohols such as retinol. In addition to the forms of ADH that may function as cytosolic retinol dehydrogenases, a microsomal retinol dehydrogenase distinct from ADH has also been identified that can oxidize retinol in vitro (Posch et al., 1991). The relative importance of these various retinol dehydrogenases in vivo remains to be determined.
In order to learn more about class IV ADH, we have initiated molecular genetic studies. A cDNA encoding human class IV ADH has been previously described (Satre et al., 1994). We have now analyzed genomic clones encoding the class IV ADH gene, which we refer to as ADH7. The intron/exon structure was determined, and the promoter region was identified. We also show that the class IV ADH gene, unlike other ADH genes, is expressed at much higher levels in the stomach than in the liver.
Southern blot hybridization was performed under the conditions described
above, using either a class IV ADH cDNA probe radiolabeled as
described above or oligonucleotide probes labeled on the 5` end
with [γ-32P]ATP and polynucleotide kinase as
described (Sambrook et al., 1989). Human genomic DNA used for
Southern blot analysis was purchased from Clontech Laboratories, Inc.
Figure 1:
Restriction map of human class IV ADH
gene. Restriction mapping and sequencing of the class IV ADH gene
revealed that it is divided into 9 exons covering about 23 kb. The
cleavage sites for several restriction endonucleases are indicated. The
5` end of the gene containing exons 1 and 2 was cloned in MZ5,
which has a 17.3-kb insert of human DNA. Exons 2-8 were cloned in
MZ9 containing 15.7 kb of human DNA of which 1.1 kb overlaps with
MZ5. Exon 9 was cloned (by polymerase chain reaction) in pPCRexon9
containing 1.2 kb of human DNA of which 0.1 kb overlaps with
MZ9.
Exon 9 and the 3`-untranslated region (which contains a PstI
restriction site) are present within pMS7, a human class IV ADH cDNA
(Satre et al., 1994).
Figure 2: Nucleotide sequences of exons for the human class IV ADH gene ADH7. The nucleotide sequences for all nine exons are shown as well as some sequence upstream and downstream of each. The locations of the eight introns are indicated with their approximate sizes in parentheses. The 5` and 3` ends of each intron contained the conserved GT/AG splice site sequences indicated with asterisks. The transcription initiation site, inferred from primer extension analysis in Fig. 5, is shown at position +1 at an adenine labeled with a closed circle. Upstream of the transcription start site, two potential transcription factor binding sites in the promoter (AP-1 and C/EBP) are indicated based upon consensus sequence matches. Downstream of the transcription start site, a TATA box in the reverse orientation (rev TATA box) is shown. The predicted amino acid sequence of the class IV ADH coding region (downstream of the initiator methionine at position +101) is numbered according to the homology with class I ADH, which is used for numbering all vertebrate ADH sequences (Jörnvall et al., 1987). The region downstream of the initiator methionine is actually 373 amino acids in length, one shorter than class I ADH due to the apparent deletion of codon 118, which is noticed when the sequences of all classes of human ADH are aligned (Satre et al., 1994). The stop codon is indicated by a triangle, and 64 bp of the 3`-untranslated region are shown.
Figure 5:
Primer extension analysis on 5` end of
human class IV ADH mRNA. A 1.0-µg sample of human stomach
poly(A) RNA (St) was subjected to primer
extension using avian myeloblastosis virus reverse transcriptase and a
specific primer consisting of the first 21 bp of noncoding strand
sequence upstream of the ATG translation start codon in the human class
IV ADH gene. This primer was also used to sequence the noncoding strand
of a genomic clone containing the 5`-untranslated region and exon 1 (G, guanine; A, adenine; T, thymine; C, cytosine). The major primer extension product from stomach
RNA, indicated by an arrow, is shown by the DNA sequencing
ladder to fall at a T residue on the noncoding strand, corresponding to
an A residue on the coding strand. This A residue (labeled +1 in Fig. 2) is located exactly 100 bp upstream of the ATG
translation start codon.
The sequences of all of the introns at the intron/exon junctions conformed to the GT/AG rule (Breathnach and Chambon, 1981). The positions at which the introns interrupted the coding region were identical to those observed in the other human ADH genes previously analyzed (Duester et al., 1986; Matsuo and Yokoyama, 1989; Von Bahr-Lindström et al., 1991; Yasunami et al., 1991; Hur and Edenberg, 1992). The sizes of the introns for the human class IV ADH gene as well as the other known mammalian ADH genes are summarized (Table 1). Except for human class V, which lacks intron 8 (and exon 9), all the genes contain 8 introns that interrupt the coding sequence at the same locations. This indicates that all classes of mammalian ADH are derived from duplication of an ancient gene that already possessed the intron/exon structure shown in the class IV ADH gene. Presumably, the class V ADH gene lost exon 9 subsequent to its divergence from other ADH genes.
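The GT/AG rule cited above can be expressed as a simple check on the first and last two nucleotides of an intron. The following is a minimal sketch only; the function name and the example sequences are hypothetical and are not taken from the ADH7 sequence.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if an intron starts with GT and ends with AG.

    The intron is given 5' to 3' on the coding strand. Sequences shorter
    than 4 nt cannot carry both dinucleotides and are rejected.
    """
    intron = intron.upper()
    if len(intron) < 4:
        return False
    return intron.startswith("GT") and intron.endswith("AG")


# Hypothetical toy introns, for illustration only.
print(follows_gt_ag_rule("GTAAGTCCTTTTCAG"))  # conforms to the rule
print(follows_gt_ag_rule("GCAAGTCCTTTTCAG"))  # 5' end is GC, not GT
```

A gene with nine exons would be checked by applying this function to each of its eight introns.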
The amino acid sequence of human class IV ADH predicted from the coding region indicated that the translated protein would contain 373 amino acids downstream of the initiator methionine (Fig. 2). The predicted amino acid sequence of human class IV ADH was aligned with the recently determined full-length sequence of the rat class IV ADH enzyme (Parés et al., 1994), indicating a rat/human interspecies sequence identity of 87.1%. The human class IV ADH sequence was also aligned with the sequences of the other human classes. The human interclass sequence identity was greater with class I ADH (69%) than with classes II, III, or V (59-61%). This greater interclass sequence identity between classes I and IV was also noticed in a comparison of rat class IV ADH with the other classes of rat ADH (Parés et al., 1994). Thus, it is reasonable to propose that ADH classes I and IV may have diverged from a common ancestral gene subsequent to its divergence from the other classes. Since classes I and IV do share a common catalytic function as retinol dehydrogenases in several mammalian species (Connor and Smit, 1987; Boleda et al., 1993; Yang et al., 1994), this further suggests a common origin.
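Percent sequence identity of the kind quoted above is computed from a pairwise alignment. The sketch below assumes that the two sequences are already aligned to equal length with "-" marking gaps, and that gapped columns are excluded from the denominator; the text does not state which convention was used, so this choice is an assumption. The function name and the toy sequences are hypothetical and are not ADH sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap (assumed convention).
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("MKT-AG", "MRTLAG"))  # 80.0
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give slightly different percentages for the same alignment.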
Figure 3: Analysis of class IV ADH gene structure in human genomic DNA. Human genomic DNA (5 µg) was digested with KpnI (A) or PstI (B) and analyzed by Southern blot analysis using as a probe the human class IV ADH cDNA containing the region from codon 8 to the 3`-untranslated region (Satre et al., 1994). The sizes in kilobase pairs of the DNA fragments able to hybridize to the class IV ADH cDNA are indicated. A shorter autoradiographic exposure allowed a clearer definition of the three KpnI fragments in lane A (data not shown).
Since all hybridizing DNA fragments in human genomic DNA digested with either KpnI or PstI correlated with the class IV ADH gene cloned, we assume that class IV ADH is encoded by a single unique gene and has no closely related pseudogenes. Likewise, no pseudogenes were detected in the genomic clones analyzed. Of the five classes of mammalian ADH that have been analyzed, only class I ADH has been shown to be encoded by more than one gene in some species; i.e. human class I ADH is encoded by three closely related genes ADH1, ADH2, and ADH3, which cross-hybridize under high stringency (Duester et al., 1986). Only class III ADH (the most ancient form of ADH) has been demonstrated to possess pseudogenes; these were found to be processed pseudogenes (Matsuo and Yokoyama, 1990; Hur and Edenberg, 1992).
Figure 4:
Expression of class IV ADH mRNA in human
tissues. Samples (0.5 µg) of poly(A) RNA from
human liver (A and C) and human stomach (B and D) were separated by formaldehyde-agarose gel
electrophoresis and subjected to Northern blot analysis. Lanes A and B were hybridized to the human ADH3 cDNA (a class I ADH gene), whereas lanes C and D were hybridized to the human class IV ADH cDNA. The sizes of
the RNAs detected are in kilobases.
The size of the mRNA for human stomach class IV ADH, estimated here at 2.3 kb by Northern blotting, correlates approximately with the size of the cDNA reported by Farrés et al. (1994), which is 2055 bp in length with a putative poly(A) signal near the 3` end but no poly(A) stretch. A 5`-untranslated region of 60 bp was reported in that cDNA, but our primer extension studies (discussed below) indicate that the mature mRNA contains 100 bp in the 5`-untranslated region. This would increase the predicted size of the mRNA to about 2095 bp plus the poly(A) stretch, which is generally 100-200 residues.
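The size estimate above is a short piece of arithmetic: the 60-bp 5`-untranslated region of the reported cDNA is replaced by the 100-bp region mapped by primer extension, and a poly(A) tail is added. The sketch below reproduces that calculation using only the figures quoted in the text; the function and parameter names are hypothetical.

```python
def predicted_mrna_length(cdna_len, cdna_5utr, mapped_5utr,
                          poly_a_range=(100, 200)):
    """Return (length without poly(A), (min, max) length with poly(A))."""
    # Swap the 5'-UTR length seen in the cDNA for the mapped length.
    body = cdna_len - cdna_5utr + mapped_5utr
    low, high = poly_a_range
    return body, (body + low, body + high)


body, with_tail = predicted_mrna_length(2055, 60, 100)
print(body)       # 2095
print(with_tail)  # (2195, 2295)
```

The resulting range of roughly 2.2-2.3 kb is in line with the 2.3-kb estimate from Northern blotting.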
The sequence downstream of this transcription initiation site (designated as position +1 bp) contains two ATG triplets (one at +1 and one at +65 bp) prior to the ATG designated as the translation initiation codon at +101 (Fig. 2). Both of the upstream ATGs are flanked by thymines at critical positions (TNNATGT) and are thus classified as nonfunctional for translation initiation (Kozak, 1987). The ATG at position +101 is flanked by an adenine and guanine at these critical positions (ANNATGG) and is one of the most common functional translation initiation codons. The ATG at position +1 is also part of the transcription initiation site and is probably subjected to 5`-capping with 7-methylguanosine, making it even less likely to be involved in translation initiation. Interestingly, the transcription initiation site for class IV ADH (5`-TCTATGT-3`) has a very similar sequence to the initiation sites previously observed in the three human class I ADH genes (ADH1, ADH2, and ADH3), which were all shown to initiate at an adenine within an ATG surrounded by pyrimidines: (5`-TTTATGC-3`) (Stewart et al., 1990).
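The two critical positions referred to above are position -3 and position +4, counting the A of the ATG as +1. The sketch below reads those two bases for a given ATG and applies a simplified favorability rule (a purine at -3 and a G at +4). That rule is an assumption made for illustration and is not a full implementation of the Kozak (1987) analysis; the function names and the ACCATGG example are hypothetical, whereas TCTATGT is the initiation-site sequence quoted in the text.

```python
def kozak_flanks(seq: str, atg_index: int) -> tuple:
    """Return the bases at positions -3 and +4 around the ATG at atg_index."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("sequence too short to read positions -3 and +4")
    # With the A of ATG as +1, position -3 is three bases upstream of the A
    # and position +4 is the base immediately after the G.
    return seq[atg_index - 3], seq[atg_index + 3]


def is_favorable_context(minus3: str, plus4: str) -> bool:
    """Simplified rule (assumption): purine at -3 and G at +4."""
    return minus3 in "AG" and plus4 == "G"


print(kozak_flanks("TCTATGT", 3))                          # ('T', 'T')
print(is_favorable_context(*kozak_flanks("TCTATGT", 3)))   # False
print(is_favorable_context(*kozak_flanks("ACCATGG", 3)))   # True
```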
Computer analysis of the DNA sequence from -496 to +100 bp was performed to scan for potential binding sites of several common transcription factors that may be part of the promoter (Fig. 2). A TATA box was not observed just upstream of the transcription initiation site near position -25 bp, which is its usual location if present (Breathnach and Chambon, 1981). However, we did observe the sequence TATATAA located from +28 to +22 bp downstream of the start site in the reverse orientation on the noncoding strand. Since the adenovirus IVa2 promoter has been shown to have a functional TATA box located about 20 bp downstream of the start site in the reverse orientation (Carcamo et al., 1990), the downstream TATA we observe in class IV ADH may also function in an unusual type of initiation event. We did not observe any GC-box binding sites (GGGCGG) for the transcription factor Sp1, which are often present in genes lacking a TATA box, nor other common sites such as those binding transcription factor CTF/NF1 (GCCAAT) and octamer-binding proteins (ATTTGCAT) (Mitchell and Tjian, 1989). This promoter also does not possess the initiator element (YAYTCYYY) often found overlapping with the site of transcription initiation, which can be found in genes that either contain or lack a TATA box (Roeder, 1991).
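A consensus scan of this kind can be sketched as a search on both strands with degenerate positions expanded into character classes. The software actually used for the analysis is not named in the text, so the code below is only an illustration of the general approach. The consensus strings are those quoted above; the function names and the toy test sequences are hypothetical and are not the ADH7 promoter.

```python
import re

# Subset of IUPAC nucleotide codes needed for the motifs in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "Y": "[CT]", "R": "[AG]", "S": "[CG]", "K": "[GT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq: str, consensus: str) -> list:
    """Return sorted (start, strand) hits; start is a 0-based index on seq.

    For '-' strand hits, start is the leftmost forward-strand index covered
    by the site. A lookahead is used so that overlapping hits are reported.
    """
    seq = seq.upper()
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    n, k = len(seq), len(consensus)
    rc = reverse_complement(seq)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


toy = "GGTTATATACC"  # hypothetical sequence with a reverse-orientation TATATAA
print(find_motif(toy, "TATATAA"))          # [(2, '-')]
print(find_motif(toy, "GGGCGG"))           # []
print(find_motif("CACTCTTT", "YAYTCYYY"))  # [(0, '+')]
```

Reporting the strand separately is what allows a reverse-orientation element, such as the downstream TATATAA described above, to be distinguished from an element in the usual orientation.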
The class IV ADH promoter was found to contain a site at -43 bp matching the TGA(C/G)TCA consensus sequence for the transcription factor AP-1 (Mitchell and Tjian, 1989), as well as a site at -35 bp matching the T(T/G)NNG(T/C)AA(T/G) consensus sequence for C/EBP (De Simone and Cortese, 1992) (Fig. 2). The lack of an upstream TATA box at -25 bp or GC boxes to help direct the site of transcription initiation suggests that the AP-1 and C/EBP sites lying adjacent to each other between -43 and -27 bp may function together or separately to help direct transcription initiation for class IV ADH. Alternatively, this promoter may use the downstream TATA box or a novel mechanism. In this respect, the class IV ADH promoter is unlike the other ADH promoters characterized, which have either an upstream TATA box, i.e. the promoters for class I ADH (Duester et al., 1986; Stewart et al., 1990) and class II ADH (Von Bahr-Lindström et al., 1991), or GC boxes clustered near the start site, i.e. the promoter for class III ADH (Hur and Edenberg, 1992).
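The site coordinates quoted above follow from the start position and the length of each consensus sequence. The sketch below assumes the usual promoter numbering in which the transcription start is +1 and there is no position 0; the function name is hypothetical.

```python
def site_span(start: int, length: int) -> tuple:
    """Return (start, end) of a site in promoter coordinates (no position 0)."""
    if start == 0 or length < 1:
        raise ValueError("start must be non-zero and length at least 1")
    # Map to a zero-based offset from +1 so that -1 and +1 are adjacent.
    offset = start - 1 if start > 0 else start
    end_offset = offset + length - 1
    end = end_offset + 1 if end_offset >= 0 else end_offset
    return start, end


print(site_span(-43, 7))  # AP-1 consensus, 7 bp: (-43, -37)
print(site_span(-35, 9))  # C/EBP consensus, 9 bp: (-35, -27)
print(site_span(22, 7))   # reverse TATATAA, 7 bp: (22, 28)
```

With these spans, the AP-1 and C/EBP sites together occupy -43 to -27 bp and are separated by a single base pair at -36.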
The presence of a C/EBP site in the class IV ADH promoter between -35 and -27 bp is similar to what was noticed for the three human class I ADH promoters that have previously been demonstrated to possess functional C/EBP sites between -45 and -37 bp (Van Ooij et al., 1992). However, the class I ADH promoters also contain a TATA box just downstream of this C/EBP site, and they contain additional C/EBP sites further upstream. Thus, if C/EBP plays a role in class IV ADH transcription, it may function differently than it does in class I ADH transcription where several related forms of C/EBP have been demonstrated to play a role in liver transcription (Van Ooij et al., 1992). The lack of class IV ADH expression in the liver further suggests a fundamentally different form of regulation than that seen for class I ADH genes. Finally, the class IV ADH promoter did not possess a retinoic acid response element in the region analyzed, thus differing from the human class I ADH gene ADH3, which was found to possess this element at -300 bp (Duester et al., 1991).
In summary, human class IV ADH was shown to be encoded by a single gene whose coding region is most closely related to that of class I ADH. However, the expression pattern of class IV ADH differs markedly from that of class I ADH. The mRNA for class IV ADH was shown to be much more abundant in stomach than in liver, thus differing from the mRNA for class I ADH and other known ADHs, which are all actively expressed in liver. The promoter for class IV ADH differs substantially from those seen in other ADH genes, lacking many of the common sequence elements that direct transcription initiation and thus representing an unusual type of promoter. The cloning of the class IV ADH gene now enables us to use molecular genetic techniques to further study its expression patterns, gene regulation, and function, particularly its role in retinoid metabolism as a retinol dehydrogenase.
The nucleotide sequences reported in this paper have been submitted to the GenBank(TM)/EMBL Data Bank with accession numbers U16286-U16293.