(Received for publication, October 30, 1995)
Multiple AE2 Cl⁻/HCO₃⁻
exchanger mRNAs have been identified in rat. To determine the
genetic basis for these mRNAs and whether they encode different
variants of the exchanger, we used both rapid amplification of cDNA
ends and S1 nuclease protection protocols and examined the organization
of the gene. mRNAs encoding three N-terminal variants of AE2 (AE2a,
AE2b, and AE2c) were identified and shown to be transcribed from
alternative promoters. The AE2a transcription unit consists of 23
exons, with exons 1 and 2 containing 5`-untranslated sequence and the
first 17 codons. The first exon of AE2b is located in intron 2; it
contains 5`-untranslated sequence and an alternative 3-amino acid
N-terminal coding sequence and is spliced to exon 3. The first exon of
AE2c is located in intron 5; it consists of 5`-untranslated sequence
and is spliced to exon 6, which contains the translation initiation
codon corresponding to Met-200 of AE2a. Northern analysis shows that
AE2a is expressed in all tissues, AE2b exhibits a more restricted
distribution with highest levels in stomach, and AE2c is expressed only
in stomach. Thus, the use of alternative promoters leads to the
production of three N-terminal variants of AE2 that exhibit
tissue-specific patterns of expression.
Electroneutral Cl⁻/HCO₃⁻
exchange in mammalian tissues is mediated by a family of proteins
encoded by multiple mRNAs from at least three genes, termed AE1,
AE2, and AE3 (reviewed in (1, 2, 3)). The AE1 gene encodes both
erythrocyte band 3 (4) and kidney band 3 (5, 6, 7, 8), an N-terminal
truncated variant of the exchanger. Kidney AE1 mRNA is transcribed from
an alternative promoter located in the third intron of the erythrocyte
transcription unit (7) and utilizes a Met codon in exon 5 as
the translation start site (7, 8). In the case of AE3,
a 4.4-kb mRNA encoding a 1227-amino acid variant is expressed in brain
and several other tissues (9, 10), and a 3.8-kb mRNA
is expressed in heart (9, 10, 11, 12, 13).
The cardiac AE3 mRNA, which encodes a 1030-amino acid N-terminal
variant of the exchanger, is transcribed from an alternative promoter
located in intron 6 of the brain transcription
unit (11, 12, 13). Thus, for both AE1 and
AE3, the use of tissue-specific alternative promoters leads to the
production of at least two proteins that differ in their N-terminal
sequences.
Full-length cDNAs encoding what appears to be a single
variant of AE2 have been cloned from multiple tissues and four
mammalian species (10, 14, 15, 16, 17);
however, Northern blot analyses suggest that there are at least three
different AE2 mRNAs (10). These include a ubiquitous 4.4-kb
mRNA that corresponds to the form that has already been cloned, a
4.2-kb mRNA that is particularly abundant in stomach but is also
present at lower levels in some other tissues, and a 3.8-kb mRNA that
is expressed only in stomach. Because AE2 mRNAs are expressed in all
tissues, it is likely that AE2 plays a housekeeping role such as the
regulation of intracellular pH or cell volume. There is evidence,
however, that it also serves more specialized functions in polarized
epithelial cells. In stomach, AE2 is expressed at high levels on the
basolateral membrane of gastric parietal cells (18), where it
presumably mediates the exchange of Cl⁻ and HCO₃⁻
that occurs during acid secretion
across the apical membrane. In small intestine, it has been localized
on the apical membranes of both villus and crypt enterocytes in
ileum (16), consistent with a role in both NaCl absorption and
HCO₃⁻ secretion. The mechanisms underlying
the large variations in AE2 mRNA expression levels in different tissues
and the mechanisms by which the protein is localized to either apical
or basolateral membranes are unknown.
Interestingly, the size
variations among the AE2 mRNAs seem to be due to differences at their
5` ends (10), consistent with the possibility that they are
derived by the use of alternative promoters and encode protein variants
that differ in their N-terminal sequences. If the multiple AE2 mRNAs
are produced by the use of alternative promoters, this could contribute
to the regulation of Cl⁻/HCO₃⁻
exchange in polarized epithelia and other tissues, either as a
mechanism for producing variations in the primary structure of the
protein, which in turn could lead to differences in membrane location
or functional properties, and/or as a mechanism for transcriptional
control. Therefore, we considered it important to determine the primary
structures of the proteins encoded by the multiple AE2 mRNAs, their
genetic basis, and their tissue-specific patterns of expression. The
results of this study demonstrate that the AE2 gene contains three
distinct promoters, which exhibit significant differences in their
tissue specificity, and that the use of these promoters leads to the
production of four mRNAs encoding three N-terminal variants of the
exchanger.
Figure 1: PCR cloning of the 5` end of the AE2b mRNA. First strand rat stomach cDNA was prepared, ligated with the anchor oligonucleotide, and analyzed by 5` RACE (see ``Experimental Procedures''). Upper left, DNA size markers and RACE products were fractionated by agarose gel electrophoresis and visualized by ethidium bromide staining. Note the discrete 470- and 290-bp products, which are derived from the AE2a and AE2b mRNAs, respectively. Upper right, the AE2a and AE2b RACE products from the first set of PCR amplifications were isolated, reamplified in separate reactions, and analyzed by agarose gel electrophoresis. Each product was isolated, subcloned, and sequenced. Bottom panel, nucleotide and deduced amino acid sequences of the 5` ends of the AE2a and AE2b mRNAs. The translation initiation codon is underlined, and the first nucleotide of exon 3, which begins the common sequence, is indicated by an arrow.
For amplification of the 5` ends of AE2a and AE2b described in Fig. 1, the P1 primer was complementary to nt +235 to +255, and the P2 primer was complementary to nt +197 to +214 of the rat AE2a coding sequence (10) . For amplification of the 5` end of AE2c described in Fig. 2, the P1 primer was complementary to nt +1047 to +1065, and the P2 primer was complementary to nt +674 to +699. After fractionation by agarose gel electrophoresis, PCR products were visualized by ethidium bromide staining. Specific PCR products were captured on NA45 paper (Schleicher & Schuell), eluted, and reamplified under the same conditions used for the first round of PCR. The reamplified products were subcloned into the pCR(TM)II plasmid vector (Invitrogen), and sequence analysis was performed by the chain termination method.
Figure 2: PCR cloning of the 5` end of the AE2c mRNA. First strand rat stomach cDNA was prepared, ligated with the anchor oligonucleotide, and analyzed by 5` RACE (see ``Experimental Procedures''). Upper left, DNA size markers and RACE products were fractionated by agarose gel electrophoresis and visualized by ethidium bromide staining. Note the discrete 380- and 320-bp products, which are derived from the AE2c and AE2a/b mRNAs, respectively. Upper right, AE2c and AE2a/b RACE products from the first set of amplifications were isolated, reamplified in separate reactions, and analyzed by agarose gel electrophoresis. Each product was isolated, subcloned, and sequenced. Bottom panel, nucleotide and deduced amino acid sequence of the 5` end of the AE2c mRNA and corresponding sequences of the AE2a/b mRNAs. The apparent translation initiation codon for AE2c is underlined, and the first nucleotide of exon 6, which begins the common sequence, is indicated with an arrow.
For amplification of the 5` end of AE2a
described in Fig. 5, the P1 primer was complementary to nt
+32 to +52, and the P2 primer was complementary to nt
+17 to +36 (numbered as in top panel of Fig. 7; see also Fig. 5). In this experiment the P2
primer was 5` end-labeled with T4 kinase and
[γ-³²P]ATP before being used in the PCR
reaction. PCR products were fractionated by electrophoresis on a 6%
polyacrylamide gel and visualized by autoradiography. Regions of the
gel containing PCR products were excised, and the PCR products were
eluted, reamplified with unlabeled primer, and subcloned into the
pCR(TM)II plasmid vector. Bacterial colonies harboring plasmids with
AE2 sequences were identified using a ³²P-labeled
oligonucleotide probe from near the 5` end of the AE2a mRNA
(complementary to nt +7 to +22 as numbered in top panel of Fig. 7). Forty-eight positive clones were isolated and
sequenced by the chain termination method.
Figure 5:
S1 nuclease and RACE analysis of the AE2a
transcription initiation site. Left panel, nuclease protection
was performed using a 5` end-labeled probe, 5 µg of rat stomach
mRNA or tRNA control, and 50, 100, or 200 units of S1 nuclease (see
``Experimental Procedures''). Samples were analyzed by
polyacrylamide gel electrophoresis and autoradiography. Markers in
first two lanes are purine (A+G) or pyrimidine (C+T) ladders
generated by chemical cleavage sequencing of the probe. Undigested
probe is shown in the last lane. Nucleotides are numbered on the left, with +1 indicating the beginning of the major
transcription initiation site corresponding to a cluster of four
protected fragments (delineated by the bracket and asterisk at the right). Right panel, RACE
analysis. cDNA was synthesized using rat stomach mRNA and primer 1,
ligated to the anchor oligonucleotide, PCR-amplified using ³²P-labeled primer 2 and the anchor primer, and aliquots of
the RACE reaction mixture were analyzed by polyacrylamide gel
electrophoresis and autoradiography (see ``Experimental
Procedures''). Numbers on the right indicate the
estimated termination site of the extension products, with +1
corresponding to the major initiation site determined by S1 analysis.
Size estimates were obtained by running sequencing ladders in adjacent
lanes and correcting for the size of the anchor sequence. Products in
the bracketed regions labeled 1-5 were excised
from the gel, reamplified, subcloned, and sequenced. 70% of the RACE
products terminated within the region spanning nt +1 to +4. Bottom panel, S1 nuclease and RACE strategies. The region
shown is from the 5` end of the AE2a transcription unit, with #
indicating the 5` end of the rat AE2a cDNA (10). The S1
nuclease probe and primers used for RACE analysis are labeled and indicated by lines. The asterisks above and
the heavy bars below the sequence indicate major initiation
sites identified by S1 nuclease and RACE analysis, respectively.
Sequences included in the major transcripts are shown in uppercase
letters and 5` flanking sequences are shown in lowercase
letters.
Figure 7: Nucleotide sequences of the regions flanking alternative exons 1a, 1b, and 1c of the AE2 gene. Upper panel, exon 1a and flanking regions. Middle panel, exon 1b and flanking regions. Bottom panel, exon 1c and flanking regions. Exon sequences are shown in uppercase letters; 5` flanking and intron sequences are shown in lowercase letters; amino acids are shown above the corresponding codons. Transcription initiation sites determined by S1 nuclease analysis are marked with an asterisk above the nucleotide, and the 5`-most nucleotide identified by RACE analysis are either underlined with a heavy bar or indicated with a caret under the nucleotide. The 5`-most nucleotide of AE2a identified by cDNA cloning (10) is indicated with a black diamond. Potential transcription factor binding sites (see text) are underlined and labeled. Nucleotides are numbered with +1 corresponding to the major transcription initiation site identified by S1 nuclease and RACE analyses and negative numbers indicating 5`-flanking regions.
Figure 3:
Northern blot analysis of alternative AE2
mRNAs in rat stomach. A blot containing 10 µg of rat stomach
poly(A) RNA was hybridized sequentially with probes
either common to all transcripts of the AE2 gene or specific for
individual transcripts of the AE2 gene. The probes (see
``Experimental Procedures'') used for each hybridization are: lane 1, a SacI-PvuII cDNA fragment common to
all AE2 mRNAs; lane 2, AE2a-specific probe; lane 3,
AE2b-specific probe; lane 4, AE2c-specific probe; lane
5, probe containing sequences from the 3` end of intron 5 and
located between exon 1c and exon 6. Autoradiographic exposure times
(left to right) were: 3, 6, 5, 14, and 2 days. Sizes of the mRNAs (in
kb) are indicated on the right.
Figure 8:
Tissue distribution of AE2 mRNAs. A
Northern blot with 5 µg of poly(A) RNA from the
indicated rat tissues was analyzed as described under
``Experimental Procedures'' with the following probes (from
top to bottom): SacI-PvuII fragment common to all AE2
mRNAs, AE2a-specific probe containing sequences from the first two
exons of AE2a, AE2b-specific probe containing sequences from exon 1b,
and AE2c-specific probe containing sequences from exon 1c.
Autoradiographic exposure times (from top to bottom)
were 6, 8, 12, and 10 days, respectively. The position of a 4.4-kb size
marker is shown on the right.
To clone the 5` end of the AE2b
transcript, a RACE cloning protocol was employed. First strand cDNA was
synthesized using rat stomach poly(A) RNA and a primer
from exon 4. The cDNA was then ligated to the anchor oligonucleotide
and PCR-amplified using the anchor primer and a primer from exon 3. Two
products, 470 and 290 bp in length, later shown to correspond to the 5`
ends of the AE2a and AE2b mRNAs, respectively, were visible over
background when the RACE reaction mixture was analyzed by agarose gel
electrophoresis and ethidium bromide staining (Fig. 1, top
left). These products were isolated from the gel, reamplified in
separate reactions, and again analyzed by agarose gel electrophoresis (Fig. 1, top right). The 470- and 290-bp products were
then isolated from the second gel, subcloned, and sequenced.
The two RACE products differed in sequence at their 5` ends but were identical at their 3` ends, with the common sequence beginning at the acceptor site of exon 3 (Fig. 1, bottom panel). As expected, the 470-bp product was derived from the AE2a mRNA; it contained sequences from exons 1, 2, and 3 of the AE2a transcription unit, with the first 17 codons occurring in exon 2. The 290-bp product was derived from an mRNA, subsequently shown to be the AE2b mRNA, in which exon 3 was immediately preceded by a unique 5` sequence. Sequence analysis of seven independent AE2b subclones revealed that they extended to a position ranging between 52 and 62 bp 5` to exon 3, with four subclones containing the entire sequence. The unique sequence, termed exon 1b, begins with a 53-nt 5`-untranslated sequence and ends with a 9-nt sequence that is headed by a potential initiation methionine codon. The 3-codon open reading frame at the end of exon 1b is in-frame with the coding sequence of exon 3. If this ATG codon serves as the translation initiation site, then the AE2b mRNA would encode an N-terminal variant of AE2 in which the first 17 amino acids of AE2a are replaced by an alternative 3-amino acid sequence.
The 5` end of the 3.8-kb AE2c mRNA was cloned using a similar RACE protocol. First strand rat stomach cDNA was synthesized using a primer from exon 8, ligated to the anchor oligonucleotide, and then PCR-amplified using the anchor primer and a primer from exon 6. Fractionation of the RACE products by agarose gel electrophoresis revealed a broad smear containing distinct bands at approximately 320 and 380 bp (Fig. 2, upper left), which were within the expected size range of the AE2c product. Each product was isolated and reamplified, analyzed by agarose gel electrophoresis (Fig. 2, upper right), and then subcloned and sequenced. The two products were identical at their 3` ends, with the common sequence corresponding to exon 6, but they differed at their 5` ends (Fig. 2, bottom panel). The 5` end of the 320-bp product terminated within exon 5 and consisted of sequences identical to common regions of AE2a and AE2b. The most likely explanation for this result is that it was due to premature termination of reverse transcriptase activity during first strand cDNA synthesis. The 380-bp product contained a 210-nt sequence at its 5` end, designated exon 1c, that did not correspond to previously characterized AE2 sequences. Exon 1c does not contain an ATG triplet and would therefore serve as 5`-untranslated sequence. The first ATG triplet of the 3.8-kb AE2c mRNA, which is in an acceptable context for initiation of translation, occurs in exon 6 and corresponds to Met codon 200 of AE2a. Thus, the AE2 variant encoded by this mRNA would lack the first 199 amino acids occurring in AE2a.
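The reasoning above (no ATG in exon 1c, so translation starts at the first ATG in exon 6) can be sketched as a simple scan of the spliced 5` end. This is a minimal illustration only: the two sequences are hypothetical placeholders, not the real exon 1c or exon 6 sequences, and the function name `first_atg` is an assumption. The scan also ignores the sequence context around the ATG, which the text notes is relevant to initiation.

```python
def first_atg(seq):
    """Return the 0-based index of the first ATG triplet, or None if absent."""
    idx = seq.upper().find("ATG")
    return None if idx == -1 else idx

# Hypothetical toy sequences for illustration; NOT the real AE2 exon sequences.
toy_exon_1c = "GGCTCCAGGCTTCCAGAC"   # stands in for an exon with no ATG
toy_exon_6 = "CCAGGATGGAGCTG"        # stands in for the exon carrying the start codon

spliced = toy_exon_1c + toy_exon_6   # exon 1c spliced directly to exon 6

print(first_atg(toy_exon_1c))        # None: the whole exon is 5`-untranslated
start = first_atg(spliced)
print(start >= len(toy_exon_1c))     # True: the first ATG lies in the downstream exon
```

An exon with no ATG can only contribute 5`-untranslated sequence, so the start codon must come from a downstream exon, as described for AE2c.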
Figure 4: Organization of the rat AE2 gene. The alternative first exons, labeled 1a, 1b, and 1c, are shown in hatched vertical boxes, and the remaining exons are shown in black vertical boxes. Open regions represent introns. Transcription initiation sites are indicated with arrows, and ATG translation initiation codons are labeled. The scale is indicated above the diagram.
In designing a RACE strategy to
identify the 5` end of the AE2a mRNA (diagrammed in Fig. 5, bottom panel), we reasoned that a 14-nt palindromic sequence
at positions +53 to +66 might interfere with reverse
transcriptase activity. Therefore, the initial cDNA synthesis was
carried out using a P1 primer complementary to a region just 5` to the
palindromic sequence. A ³²P-labeled P2 primer that extended
11 nt further 5` was used for the RACE reaction. Analysis of the
labeled RACE products on a polyacrylamide gel revealed the presence of
bands corresponding to products terminating between positions +1
and +20 (Fig. 5, right panel). The products from
the regions indicated by brackets and labeled 1-4 were isolated, reamplified, and subcloned. Sequence analysis
demonstrated that clusters 2 and 4 consisted of PCR artifacts,
whereas clusters 1 and 3 contained AE2 products with 5`
termini at positions ranging from nt -1 to +5.
To avoid missing any longer RACE products present in the region above cluster 1 and to obtain better quantitation of the frequency at which specific transcription initiation sites might be used, a piece of gel containing the region beginning with the bottom of cluster 3 and extending beyond nt -30 (bracketed region 5 in right panel of Fig. 5) was excised, and the DNA was eluted and reamplified. PCR products were subcloned, and colonies were identified using a probe corresponding to nt +7 to +22. Sequence analysis of 48 randomly selected positive clones showed that the majority of the RACE products (34/48) began within a cluster at nt +1 to +4, corresponding to the major cluster identified by S1 nuclease protection, and a single clone began at nt +6. Most of the remaining clones (10/48) began at nt -1 (two clones), -3, or -4, which correspond to minor protected fragments identified by S1 nuclease protection, and several additional clones (3/48) began at positions -11 or -8. No products extending beyond this site were identified. These data and the data from S1 nuclease protection analysis indicate that there are no additional exons upstream of exon 1a, and demonstrate that the major site for initiation of the AE2a transcript is in the region designated nt +1 to +4 (see Fig. 7).
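The clone counts above can be tallied to confirm that the four classes account for all 48 clones and that the major cluster matches the roughly 70% figure given in the legend to Fig. 5. The sketch below uses only the counts stated in the text; the grouping labels and variable names are illustrative assumptions.

```python
# Number of sequenced RACE clones by 5` start position, as reported in the text.
# The dictionary keys are descriptive labels chosen for this sketch.
clone_counts = {
    "+1 to +4 (major cluster)": 34,
    "+6": 1,
    "-1, -3, or -4": 10,
    "-11 or -8": 3,
}

total = sum(clone_counts.values())
major_fraction = clone_counts["+1 to +4 (major cluster)"] / total

print(f"total clones: {total}")                        # total clones: 48
print(f"major cluster fraction: {major_fraction:.1%}") # major cluster fraction: 70.8%
```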
On the basis of the gene characterization and RACE experiments, it seemed likely that the transcription initiation sites for AE2b and AE2c were located in introns 2 and 5 of the AE2a transcription unit, respectively. S1 nuclease protection analysis of the AE2b mRNA using a probe derived from sequences in intron 2 revealed a prominent cluster of seven protected fragments (Fig. 6, left panel), with the largest fragment beginning at a G residue located 62 nt upstream of the donor splice site of exon 1b. Because this residue was also the 5`-most nucleotide identified in four of the seven AE2b RACE subclones analyzed, we conclude that it serves as the transcription initiation site. S1 nuclease analysis of the AE2c transcript confirmed that its transcription initiation site is located in intron 5. Two protected fragments were observed (Fig. 6, right panel), with their 5` ends corresponding to C and A residues located 211 and 209 nt upstream of the donor splice site of exon 1c. This result correlates well with the RACE analysis, in which the longest subclone extended 210 nt upstream of the donor site.
Figure 6: S1 nuclease analysis of the AE2b and AE2c transcription initiation sites. 5` end-labeled probes were hybridized with 2.5 or 5 µg of rat stomach mRNA or 5.0 µg of yeast tRNA, digested with 100 units of S1 nuclease, and analyzed by polyacrylamide gel electrophoresis and autoradiography. Left panel, analysis of AE2b. The ladder shown in the first four lanes was generated by chemical cleavage sequence analysis of the probe. Undigested probe is shown in the last lane. Nucleotides corresponding to protected fragments (sense strand) are indicated on the left. Right panel, analysis of AE2c. The ladder shown in the first four lanes was generated by chain termination sequence analysis of the region corresponding to the probe. Nucleotides (sense strand) corresponding to ends of the protected fragments are indicated on the left.
One of these variants is AE2a, a 1234-amino acid protein encoded by a 4.4-kb mRNA that is transcribed from the 5`-most promoter. Mapping of the transcription start site was hampered by palindromic sequences and the GC-richness of exon 1a and the 5`-flanking sequence. Nevertheless, S1 nuclease and RACE analyses show that the major site of initiation occurs at nt +1 to +4 (Fig. 5 and Fig. 7) and that a lower level of initiation occurs within the 11-nt sequence preceding this cluster. Some protection of the full-length S1 nuclease probe at nt -15 was observed; however, none of the 48 RACE products analyzed extended to this site. This suggests that the resistance of this region to S1 nuclease may have been due to formation of a hairpin secondary structure in the small portion of the probe not protected by the AE2a mRNA, although the possibility of a low level of transcription initiation occurring at this site or further 5` cannot be ruled out. The 5`-flanking region of AE2a lacks the more common basal promoter elements, but does contain several CACCC sequences (22, 23) in both orientations and, in this respect, is similar to the 5`-flanking region of erythrocyte AE1 (7, 28). The AE2a promoter is active in most, if not all, mammalian tissues, in contrast to both the erythrocyte AE1 promoter (28), which is highly tissue-specific, and the brain AE3 promoter, which is moderately tissue-specific (10). The apparently ubiquitous expression of the AE2a variant (Fig. 8) suggests that it serves a housekeeping function in many cell types.
The second variant, AE2b, is a 1220-amino acid protein that contains an alternative 3-amino acid N-terminal sequence that replaces the first 17 amino acids of AE2a. The results of S1 nuclease analysis correlated well with the results of RACE analysis, and demonstrated that the 4.2-kb AE2b mRNA is transcribed from an alternative promoter located in intron 2. The sequence immediately surrounding the transcription initiation site closely matches the consensus CAP signal frequently occurring at initiation sites(21) . A potential CCAAT element occurs 65 nt upstream, which is within the preferred region for this element between nt -57 and -212(21) . Additional CCAAT sequences occur further upstream, but their functional significance is questionable as they are outside the preferred region for functional CCAAT elements. CACCC sequences and a potential Sp1 binding site were also observed. The AE2b mRNA is expressed in a more limited set of tissues than AE2a (Fig. 8), with the highest levels in stomach, consistent with the possibility that AE2b serves more specialized physiological functions than the ubiquitous AE2a.
The third variant, AE2c, is a 1035-amino acid protein that is identical to AE2a except that it lacks the first 199 amino acids. It is encoded by two mRNAs, 3.8 and 4.1 kb in length, that are transcribed from a promoter located in intron 5. The 4.1-kb mRNA, which retains the intron sequence between the donor splice site of exon 1c and the acceptor site of exon 6, contains multiple upstream open reading frames, which might interfere with translation of the long open reading frame encoding AE2c. Because of this, it is unclear whether it is a functional mRNA or an incompletely processed mRNA. A number of cases have been reported in which incomplete processing of mRNAs serves as a regulatory mechanism (reviewed in (29)), and the existence of the 4.1-kb AE2c2 mRNA raises the possibility that this mechanism might be used in the regulation of AE2c expression. The results of S1 nuclease and RACE analysis showed that the AE2c transcription start site is located 209-211 nt upstream of the donor splice site of exon 1c. An AT-rich sequence that might serve as a TATA element begins at position -30 relative to the transcription initiation site. This is preceded by a potential CCAAT element at position -188, which is within the preferred region for such elements, and an inverted CACCC sequence at position -59. The AE2c promoter is highly tissue-specific, with expression of the AE2c mRNAs being observed only in stomach (Fig. 8), suggesting that this variant might serve a cell-type or organ-specific function.
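The protein lengths quoted for the three variants follow directly from the N-terminal changes described above. As a quick consistency check, the sketch below derives the AE2b and AE2c lengths from the 1234-residue AE2a using only numbers given in the text; the helper name `variant_length` is an assumption introduced for illustration.

```python
AE2A_LENGTH = 1234  # amino acids in AE2a, as stated in the text

def variant_length(full_length, n_term_removed, n_term_added=0):
    """Length of a variant after swapping or truncating the N terminus."""
    return full_length - n_term_removed + n_term_added

# AE2b: the first 17 residues of AE2a are replaced by 3 alternative residues.
ae2b_length = variant_length(AE2A_LENGTH, n_term_removed=17, n_term_added=3)

# AE2c: translation starts at Met-200, so the first 199 residues are absent.
ae2c_length = variant_length(AE2A_LENGTH, n_term_removed=199)

print(ae2b_length)  # 1220
print(ae2c_length)  # 1035
```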
Figure 9: Intron/exon organization, promoter usage and splicing patterns of the AE gene family. Patterns of promoter usage and splicing that lead to the generation of multiple AE1, AE2, and AE3 mRNAs are indicated. Exons are numbered below and shown as boxes, with hatched regions indicating untranslated sequences and open regions indicating coding sequences. Exons 1k of AE1 and 1c of AE3 are the alternative first exons for kidney AE1 and cardiac AE3. Exons 1a, 1b, and 1c are alternative first exons for the AE2 variants discussed in text. Transcription initiation sites (see Footnote 3) are indicated with arrows and translation initiation sites with ATG. Homologous exons of the AE2 and AE3 genes are aligned; exons 5-20 of AE1 and exons 8-23 of AE2 and AE3 are also homologous. Left column, AE1e, erythrocyte AE1 mRNA(4, 7, 19) ; AE1k1, kidney AE1 mRNA with alternative first exon spliced to exon 4, corresponding to mouse and human kidney mRNAs and a minor rat kidney mRNA(5, 7, 8) ; AE1k2, major rat kidney mRNA in which intron 3 sequences are retained(5, 7) . AE3b, mRNA encoding brain form of AE3, which is also expressed in some other tissues(9, 10) ; AE3c, mRNA encoding cardiac form of AE3(11, 12, 13) . AE2 mRNAs are described in text. Right column, number of codons in long open-reading frame of each mRNA.
When the AE2 (or AE3) exon sequences are aligned with those of the AE1 gene, the positions of the splice junctions for exons 8-23 of AE2 are identical to those of exons 5-20 of AE1, except that the junction between exons 13 and 14 of AE2 is shifted by a few codons relative to the junction between exons 10 and 11 of AE1. Based on the conservation of these junctions and the high degree of similarity between the amino acid sequences(10) , it is apparent that the last 16 exons of all three AE genes are homologous. However, there is little, if any, significant similarity between the first four exons of the AE1 gene and the first seven exons of the AE2 or AE3 genes. This suggests that the 5`-most exons of the AE1 gene may have had a separate evolutionary origin from the corresponding region of the AE2 and AE3 genes. One possibility is that the differences at the 5` ends of these genes might have arisen during the chromosomal rearrangement events that were responsible for dispersing these genes in the mammalian genome. For example, the current structure of either the AE1 gene or the AE2 and AE3 genes might have resulted from a chromosomal rearrangement in which the last 16 exons of an ancestral gene were recombined with the promoter and 5` coding exons of another gene.
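The exon correspondence described above amounts to a constant offset of three between AE1 exon numbers and AE2/AE3 exon numbers across the conserved region. The sketch below, with illustrative names that are not from the paper, builds that mapping and confirms that it covers 16 exons and pairs the AE1 exon 10/11 junction with the AE2 exon 13/14 junction.

```python
# Exon 5 of AE1 aligns with exon 8 of AE2/AE3, so the numbering differs by 3.
AE1_TO_AE2_OFFSET = 3
AE1_CONSERVED = range(5, 21)  # AE1 exons 5-20 inclusive

def ae1_to_ae2_exon(ae1_exon):
    """Map a conserved AE1 exon number to its homologous AE2/AE3 exon number."""
    if ae1_exon not in AE1_CONSERVED:
        raise ValueError("only AE1 exons 5-20 have AE2/AE3 homologs")
    return ae1_exon + AE1_TO_AE2_OFFSET

homologs = {exon: ae1_to_ae2_exon(exon) for exon in AE1_CONSERVED}

print(len(homologs))               # 16
print(homologs[5], homologs[20])   # 8 23
print(homologs[10], homologs[11])  # 13 14
```

The same offset relates the internal promoter positions discussed in the text: intron 3 of AE1 corresponds in position to intron 6 of AE3.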
The patterns of alternative promoter and exon usage of the AE2 gene are analogous to those of the AE1 and AE3 genes (Fig. 9), although the locations of the internal promoters and the consequent variations in protein structure are different. Transcription from the internal promoters leads either to a switch in N-terminal amino acid sequence or to a truncation of the protein, depending on whether the alternative first exon contains coding sequence or consists entirely of untranslated sequence. In the latter case an internal Met codon in a downstream exon serves as the translation initiation site. The internal promoters for AE2b and AE2c are located in introns 2 and 5 of the AE2a transcription unit, respectively, whereas the kidney AE1 promoter is located in intron 3 and the cardiac AE3 promoter is located in intron 6. It is clear from their positions within each gene that the promoters, and associated first exons, for AE2b, AE2c, and cardiac AE3 arose independently of each other during evolution, but it is less apparent whether the promoters and first exons for kidney AE1 and cardiac AE3 arose independently, as they occupy the same positions relative to the 16 highly conserved exons. However, the differences in tissue specificity and the absence of significant similarity between exon 4 of AE1 and exon 7 of AE3, which lie between the alternative first exon and the highly conserved exons, argue against the possibility that these promoters share a common evolutionary origin.
Because use of the alternative AE2 promoters leads to variations in N-terminal amino acid sequences, it is likely that they also serve as a mechanism for producing protein variants with altered, and physiologically relevant, functional characteristics. Generation of the AE2b and cardiac AE3 variants is analogous in that each involves a switch in the N-terminal amino acid sequence relative to the long forms of each exchanger. The extent of the sequence alterations, however, is quite different. The differences between AE2a and AE2b are restricted to the extreme N-terminal sequence encoded by the first coding exon. In contrast, transcription from the cardiac AE3 promoter results in the replacement of a 270-amino acid N-terminal sequence of brain AE3, which is encoded by five exons, with an alternative 73-amino acid sequence encoded by the cardiac-specific first exon (11, 12, 13). The 270-amino acid sequence that is eliminated contains a histidine-rich domain, numerous proline-rich regions, and domains consisting of stretches of acidic or basic residues that have counterparts in AE2a (10, 14). Transcription from the AE2c promoter leads to production of an mRNA encoding an N-terminal truncated form of the exchanger in which Met codon 200 serves as the apparent initiation codon. In this respect the AE2c mRNA is analogous to the kidney AE1 mRNA, which also encodes an N-terminal truncated exchanger that utilizes an internal Met codon as a translation start site. At the protein level the differences between AE2c and AE2a are similar to the differences between cardiac AE3 and brain AE3. AE2c is approximately the same size as cardiac AE3, and the truncation of its N-terminal sequence removes the histidine-rich domain (residues 74-88), the proline-rich regions, and the stretches of acidic (residues 122-130) and basic amino acids (residues 94-109) that are homologous to those eliminated in cardiac AE3.
Although the functions of these domains have not been determined, it is likely that they play a role in regulating the activity of the exchanger.
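Two statements in the preceding comparison can be checked numerically: the cardiac AE3 length implied by the 270-for-73 residue swap, and whether the named AE2a domains all fall inside the 199 residues missing from AE2c. The sketch below uses only figures quoted in the text; the variable names are illustrative assumptions.

```python
# Cardiac AE3: 270 N-terminal residues of brain AE3 are replaced by 73 residues.
BRAIN_AE3_LENGTH = 1227
cardiac_ae3_length = BRAIN_AE3_LENGTH - 270 + 73
print(cardiac_ae3_length)  # 1030

# AE2c lacks residues 1-199 of AE2a (translation begins at Met-200).
AE2C_MISSING_THROUGH = 199

# Residue ranges (inclusive) of the AE2a domains named in the text.
ae2a_domains = {
    "histidine-rich": (74, 88),
    "basic stretch": (94, 109),
    "acidic stretch": (122, 130),
}

# A domain is fully removed if its last residue lies within the missing region.
removed_in_ae2c = {
    name: end <= AE2C_MISSING_THROUGH
    for name, (start, end) in ae2a_domains.items()
}
print(all(removed_in_ae2c.values()))  # True
```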
Replacement of the 17-amino acid N terminus of AE2a with the 3-amino
acid N terminus of AE2b is a relatively limited sequence alteration
compared with those seen in the other AE variants. However, the unique
N terminus of AE2a contains a potential phosphorylation site for
cAMP-dependent protein kinase at serine 10. An increase in cAMP is
known to inhibit the absorption of NaCl across the apical membranes of
intestinal epithelial cells(30) . This process involves the
coupled activities of a Cl⁻/HCO₃⁻ exchanger, which
appears to be the AE2a variant (16), and a
Na⁺/H⁺ exchanger (16) that is
thought to be NHE3 (31). Although we are unaware of any
evidence demonstrating that cAMP causes a direct inhibition of AE2 in
intestine, cAMP has been shown to inhibit
Cl⁻/HCO₃⁻ exchange in
osteoblasts (32). The identity of the exchanger in osteoblasts
was not determined, but it is conceivable that it is the ubiquitous
AE2a. In stomach, secretion of acid is stimulated by cAMP(33) ,
but cAMP does not alter the transport capacity of the basolateral
Cl⁻/HCO₃⁻ exchanger in the
parietal cell (34), which is known to be a variant of
AE2 (18). If Ser-10 of AE2a does serve as an inhibitory
phosphorylation site, then the elimination of this site in AE2b and
AE2c could serve an important function during acid secretion.
A
second possibility that should be considered is that the variant
N-terminal sequences contain sorting signals or cytoskeletal attachment
sites that influence the membrane location of the exchanger in
polarized epithelial cells. Such a possibility is suggested by the
recent demonstration that some variants of chicken AE1 are sorted to
the plasma membrane when expressed in human erythroleukemia cells,
whereas other variants, which differ in their N-terminal sequences, are
retained in intracellular membranes(27) . Immunolocalization
studies have shown that AE2 is present on apical membranes of villus
and crypt enterocytes in ileum (16) and on basolateral
membranes of gastric parietal cells(18) . The variant that was
cloned from ileum is AE2a (16) , and previous Northern blot
data (10) suggest that in ileum the level of AE2a mRNA is
greater than that of AE2b, consistent with the possibility that AE2a is
the apical exchanger in ileum. Likewise, AE2b and AE2c mRNAs are
abundant in stomach, suggesting that one or both of these variants
might mediate Cl⁻/HCO₃⁻
exchange across the basolateral membrane of the parietal cell.
There is at least one known example in which the use of alternative
promoters leads to an alteration in the localization of the encoded
proteins. It has been shown that leukemia inhibitory factor can exist
as a diffusible form or as an extracellular matrix-associated form (35) , and that the two proteins, which contain only slightly
different N-terminal sequences (MKVLAAG for the diffusible form and
MRCR for the matrix-associated form), are encoded by mRNAs that are
transcribed from alternative promoters. Thus, it seems reasonable to
speculate that the unique 17-amino acid N-terminal sequence of AE2a
might localize this variant to the apical membrane in certain polarized
epithelial cells, and that elimination of this sequence, as in AE2b and
AE2c, might result in sorting to the basolateral membrane. If future
experiments prove this hypothesis to be correct, it would not rule out
the possibility that the same variant can be sorted to different
membranes depending on the cell-type in which it is expressed, but it
would provide an explanation of at least one mechanism by which AE2 can
be localized to either apical or basolateral membranes. Also, it would
implicate AE2b as a possible candidate for the
Cl⁻/HCO₃⁻ exchanger that
functions on the basolateral membranes of epithelial cells of renal
proximal tubules (36) , the thick ascending limb(37) ,
and
-intercalated cells of the cortical collecting
duct(38) .
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) U45885-U45887.