(Received for publication, November 5, 1996, and in revised form, January 16, 1997)
From the Departments of Pharmacology and Internal
Medicine, University of Iowa College of Medicine,
Iowa City, Iowa 52242
Pituitary adenylate cyclase-activating
polypeptide (PACAP) elicits its diverse biological actions by
interacting with both PACAP-selective type I PACAP receptors (PACAPRs)
and type II PACAPRs that do not distinguish between PACAP and
vasoactive intestinal polypeptide. Using long distance polymerase chain
reaction, we amplified and characterized the entire coding region of
the rat type I PACAPR (rPACAPR) gene, which spans 40 kilobases and
contains 15 exons. Mapping of the exons and sequencing of all
intron-exon boundaries revealed a structural organization of the
rPACAPR gene that is very similar to those encoding other members of
the calcitonin/secretin/parathyroid hormone receptor family. Southern
blot analysis demonstrated a single copy of the rPACAPR gene. A
combination of rapid amplification of cDNA ends and reverse
transcriptase polymerase chain reaction revealed an unexpected
diversity in the rPACAPR mRNA in the 5′-untranslated (5′-UTR)
region. Four rPACAPR cDNAs were identified with 5′-UTR sequences
that all diverged from the genomic sequence at a site 76 bp upstream of
the ATG start codon, where a consensus 3′ splice acceptor sequence was
located. Sequence analysis of these amplified transcripts demonstrated
that they arise by tissue-specific differential usage of four exons in
the 5′ noncoding region of the rPACAPR gene. This study is the first to
elucidate the structural organization of a PACAPR gene and to
demonstrate that alternative splicing generates rPACAPR transcripts
with unique 5′-UTRs.
In 1989, Arimura and colleagues (1) discovered a novel bioactive peptide in their attempt to identify new hypothalamic hormones regulating anterior pituitary hormone secretion. This peptide was named pituitary adenylate cyclase-activating polypeptide (PACAP), reflecting its potent ability to stimulate increases in cAMP in cultured rat anterior pituitary cells. Structural studies revealed the peptide to be a C-terminal amidated 38-amino acid peptide (PACAP-38), and subsequent studies resulted in identification of the other molecular form of PACAP (PACAP-27) (2), formed by proteolytic processing of the encoded PACAP precursor protein. PACAP is a member of the VIP/secretin/glucagon/GRF family of neuropeptides. Immunochemical studies have revealed the existence of PACAP-containing nerve fibers throughout the central and peripheral nervous system, and radioimmunoassay results have demonstrated a broad distribution and range of tissue concentrations of the two molecular forms of PACAP (3-5).
PACAP possesses an impressive array of biological actions consistent with its diverse tissue distribution and suggested roles as hypophysiotropic hormone, neurotransmitter, neuromodulator and vasoregulator (5, 6). PACAP stimulates hormone release from various cells (1, 7-13) and is the most potent insulin secretagogue yet described (14). PACAP stimulates neurite outgrowth in pheochromocytoma PC-12 cells (15), promotes mitogenesis and survival of cultured rat sympathetic neuroblasts (16, 17), and prevents neuronal cell death induced by human immunodeficiency virus protein gp120 in dissociated hippocampal cultures (5). PACAP dilates various vessels (5, 18), induces hypotension (1, 19), stimulates steroidogenesis (20), and stimulates hepatic glycogenolysis (21). A multifunctional role of PACAP is suggested further by the presence of dense networks of PACAP fibers and/or high affinity binding sites for PACAP in these target tissues (reviewed in Ref. 5).
PACAP produces its biological effects by interacting with at least two types of high affinity receptors, type I PACAP-preferring receptors and type II receptors, which do not distinguish between PACAP and VIP (5, 22, 23). Recombinant type II receptors (24, 25) bind VIP and PACAP with equal affinity and activate adenylyl cyclase. Based upon their pharmacological properties and tissue distribution, these receptors appear to represent the known VIP receptor (i.e. the receptor that has shared ligand specificity for PACAP). The type I PACAPR, cloned by several laboratories (26-31), binds PACAP with an affinity approximately 1000 times higher than VIP and activates adenylyl cyclase. Spengler et al. (30) described five splice variants of this receptor differing in the region corresponding to the third intracellular loop of the receptor. Four of these five splice variants exhibited the multifunctional signaling capability of PACAP receptors described in several cell types (15, 32, 33), activating both adenylyl cyclase and phosphoinositide phospholipase C. Structurally, the PACAPR exhibits sequence homology to the new family of G protein-coupled receptors first identified by cloning of secretin, calcitonin, and PTH receptors (34-36).
The understanding of the mechanisms involved in PACAPR expression at
transcriptional and translational levels and the physiologic role of
the PACAPR gene would be facilitated greatly by elucidation of the
structure and processing of its gene. In this report, we describe the
complete structural organization of the coding region of the rPACAPR
gene. We demonstrate that this gene consists of at least 15 exons and
14 introns with an intron-exon structure highly conserved among other
known genes in the secretin/calcitonin/PTH family of G protein-coupled
receptors. We also provide the first evidence for tissue-specific
differential splicing in the 5′-UTR of the rPACAPR mRNA, which generates
rPACAPR transcripts with unexpected 5′-UTR diversity.
Overlapping fragments of genomic DNA encompassing the entire
coding sequence of the rPACAPR gene (Fig. 1) were amplified from rat
genomic DNA (CLONTECH) by long distance PCR using
Elongase (Life Technologies, Inc.) and a series of forward and reverse primers to rPACAPR cDNA sequences. The routine PCR mixture (50 µl) contained 60 mM Tris-SO4, pH 9.1, 18 mM (NH4)2SO4, 2 mM MgSO4, 0.2 mM dNTPs, 0.2 µM primers, and 2 units of Elongase enzyme mix. Amplification was performed for 35-40 cycles with denaturation at
94 °C for 30 s followed by annealing/extension at 68 °C for 5-15 min. Automated fluorescent dideoxy sequencing of the purified PCR
products was performed at the University of Iowa DNA Core Facility.
PCR Amplification of the ATG 5′-Flanking Sequence
The ATG 5′-flanking sequence of the rPACAPR gene was
PCR-amplified using the Rat Promoter Finder kit
(CLONTECH). Briefly, the supplied rat genomic DNA,
digested with various restriction enzymes and ligated at both 5′ and 3′
ends with an adaptor DNA of known sequence, served as templates in PCR
using a forward primer to the adaptor sequence and a reverse primer to
sequences unique to the rPACAPR cDNA. The 5′-flanking sequence was
obtained by automated fluorescent dideoxy sequencing of the purified
PCR products (University of Iowa DNA Core Facility).
The sizes and locations of introns within genomic DNA PCR products were determined by a combination of agarose gel electrophoresis and DNA sequencing. By comparison of the sequence of the amplified genomic DNA to that of the rPACAPR cDNA we were able to confirm the rPACAPR cDNA sequence and to identify intron-exon boundaries and thereby also determine the sizes of the introns and exons.
Restriction mapping of the rPACAPR gene was performed by digestion of the overlapping genomic DNA PCR products with EcoRI, BamHI, BglII, HindIII, and SacI followed by agarose gel electrophoresis.
Genomic Southern Blot Analysis
Rat genomic DNA (15 µg)
was digested with restriction enzymes EcoRI,
BamHI, BglII, HindIII, and
SacI, electrophoresed on a 0.8% agarose gel, and blotted to
nitrocellulose membrane. The membrane was hybridized with a 767-bp
cDNA probe (nucleotides 185-951 of the rPACAPR cDNA) (Fig. 1)
labeled with [α-32P]CTP (800 Ci/mmol, Amersham Corp.)
by the random priming method. Hybridization was performed overnight at
65 °C in 1 × SSC, 5 × Denhardt's solution, 0.1% SDS,
50 µg/ml salmon sperm DNA, 10% dextran sulfate, and the membrane was then washed with
0.1 × SSC, 0.1% SDS at 65 °C. Autoradiography was performed
for 4 days at −70 °C.
Oligo(dT)-primed cDNA from rat spinal cord was
ligated to a synthetic oligonucleotide of known sequence at its 3′ end
essentially as we described previously (37). This anchor-ligated cDNA served
as the template for PCR amplification of its 3′ end (i.e.
corresponding to the 5′ mRNA end) using a forward primer
complementary to the anchor sequence and a reverse primer specific to
sequences of the rPACAPR cDNA. Two rounds of seminested PCR
(i.e. with different reverse primers and the same forward
primer) resulted in amplification of products ranging in size from
approximately 250 to 900 bp. The reverse primer used in the final round
of PCR corresponded to nucleotides 783-807 of the rPACAPR cDNA,
suggesting that the largest PCR product extended approximately 100 bp
into the 5′-UTR of the rPACAPR cDNA. PCR products were
cloned, and multiple colonies were selected for sequencing.
Oligo(dT)-primed cDNA from various
rat tissues or cells (pancreatic β-islets) was amplified in two
successive rounds of seminested PCR using a forward primer
corresponding to a sequence found within the 5′-UTR
(GTCTGGACCGGCCCGGAGACCAG, obtained by anchored PCR) of the rPACAPR
transcript. In the first round of PCR, this forward primer was used in
combination with a reverse primer corresponding to nucleotides 292-325
of the rPACAPR cDNA. In the final round of PCR, the same forward
primer was used in combination with a nested reverse primer
corresponding to nucleotides 230-262 of the rPACAPR cDNA.
PCR-amplified products were resolved by agarose gel (2%)
electrophoresis, and selected products were extracted from the gel,
purified, and subjected to DNA sequencing.
Poly(A)+ RNA was
isolated from rat cerebellum and rat cerebral cortex using a commercial
kit (Invitrogen). Poly(A)+ RNA from rat cerebral cortex (7 µg) and rat cerebellum (5 µg) and yeast tRNA (30 µg) were reverse
transcribed with avian myeloblastosis virus reverse transcriptase using a
32P-end-labeled oligonucleotide (1 × 10⁵ cpm) corresponding to
nucleotides −1 to −37 of the rPACAPR cDNA. Reaction products were separated on a denaturing 9% polyacrylamide gel
using the sequence of the rPACAPR cDNA as a size standard.
To determine the structure of
the coding sequence of the rPACAPR gene, we performed long distance PCR
with rat genomic DNA and primers to rPACAPR cDNA sequences to
systematically walk up the entire coding sequence of the rPACAPR gene.
Amplification by Taq polymerase is limited generally to DNA
templates less than 5 kb because of its inability to correct
misincorporations via 3′ to 5′ exonuclease activity. To accomplish
amplification of long segments of genomic DNA and to assure the
fidelity of amplification, we used a commercially available mixture
(Elongase) of Taq polymerase and Pyrococcus sp.
GB-D DNA polymerase, the latter exhibiting 3′ to 5′ exonuclease
activity. Using this strategy, we successfully amplified six
overlapping genomic DNA segments ranging in size from 1.1 to 11.9 kb
that spanned the entire coding sequence of the rPACAPR gene (Fig. 1).
By sequencing these PCR products, we were able to
confirm the rPACAPR cDNA sequence and identify the locations and
sizes of introns. Alignment of these genomic PCR products demonstrated
that the coding region of the rPACAPR gene spans 40 kb of DNA and is
interrupted by 14 introns. The intron-exon organization of the rPACAPR
gene in relation to its mRNA is shown in Fig. 1.
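The tiling requirement behind this genomic walk (each amplified segment must overlap its neighbor so that no stretch of the gene is left unsequenced) can be sketched in Python. The coordinates below are hypothetical placeholders, chosen only to respect the reported size range (1.1 to 11.9 kb) and the 40-kb span; they are not the mapped positions of the six PCR products.

```python
def covers_region(amplicons, region_start, region_end):
    """Return True if the (start, end) amplicons tile the whole region
    and every consecutive pair overlaps (no gaps, no mere abutment)."""
    ordered = sorted(amplicons)
    if not ordered or ordered[0][0] > region_start:
        return False
    reach = ordered[0][1]
    for start, end in ordered[1:]:
        if start >= reach:  # next segment does not overlap: the walk is broken
            return False
        reach = max(reach, end)
    return reach >= region_end


# Hypothetical coordinates (bp); assumptions for illustration only.
example_amplicons = [
    (0, 11_900), (11_000, 12_100), (11_800, 20_000),
    (19_500, 29_000), (28_500, 36_000), (35_500, 40_000),
]

if __name__ == "__main__":
    print(covers_region(example_amplicons, 0, 40_000))
```

A check of this kind only confirms contiguity of the amplified segments; confirming the cDNA sequence still requires sequencing each product, as described above.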
Table I shows the sizes of introns and the intron-exon splice junction sequences in the coding region of the rPACAPR gene. As shown, the introns vary in size from 320 bp to 10.5 kb, and all of the splice acceptor and donor sequences agree with the GT/AG consensus sequence (38). The rPACAPR gene intron splice phasing is type 0 (the intron occurs between codons) for introns 1, 6, 8, 11, 13, and 14; type 1 (the intron interrupts the first and second base of the codon) for introns 2, 3, 4, 5, and 9; and type 2 (the intron interrupts the second and third base of the codon) for introns 7, 10, and 12.
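The phase classification used above reduces to simple modular arithmetic on the number of coding bases that precede an intron. The following Python sketch illustrates this; the function names are hypothetical, and the only data taken from the text are the reported phases of introns 1-14 and the 21-bp size of exon 4.

```python
# Intron phases as reported in the text (phase -> intron numbers).
REPORTED_PHASES = {
    0: [1, 6, 8, 11, 13, 14],
    1: [2, 3, 4, 5, 9],
    2: [7, 10, 12],
}


def intron_phase(coding_bases_upstream):
    """Phase 0: intron lies between codons; phase 1: after the first base
    of a codon; phase 2: after the second base."""
    if coding_bases_upstream < 0:
        raise ValueError("number of coding bases cannot be negative")
    return coding_bases_upstream % 3


def phase_of_each_intron(reported=REPORTED_PHASES):
    """Invert the phase table into an intron -> phase lookup."""
    return {intron: phase
            for phase, introns in reported.items()
            for intron in introns}


def exon_length_fits_phases(exon_length_bp, upstream_phase, downstream_phase):
    """An internal coding exon keeps the reading frame only if
    (upstream_phase + length) mod 3 equals the downstream phase."""
    return (upstream_phase + exon_length_bp) % 3 == downstream_phase


if __name__ == "__main__":
    phases = phase_of_each_intron()
    # Exon 4 (21 bp) lies between introns 3 and 4, both reported as type 1.
    print(exon_length_fits_phases(21, phases[3], phases[4]))
```

As a consistency check on the reported data, the 21-bp exon 4 is flanked by two type 1 introns, which is what a length divisible by three requires.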
The relationship between exon and intron locations to proposed
structural domains of the rPACAPR is shown in Fig. 2.
Exons range in size from 21 bp (exon 4) to more than 185 bp (exon 15). The N-terminal extracellular domain of the receptor is encoded by the
first six exons and part of the seventh exon. Transmembrane domains I,
II, III, and VI are entirely within exons 7, 8, 9, and 13, respectively, while transmembrane domains IV, V, and VII are each
interrupted by a single intron. Intracellular domains 1, 2, and 4 are
either intronless (2i) or have a single intron located one amino acid
from the intracellular domain/transmembrane domain junction (1i and
4i). However, the third intracellular domain (3i) is interrupted by an
intron located four amino acids from transmembrane VI. This intron
(intron 12) is the largest intron present in the coding sequence of the
rPACAPR and is located precisely where alternative splicing can occur
to produce the third intracellular loop splice variant forms of the
PACAPR (30). Indeed, we identified alternate 3′ splice acceptor sites
located 2.8 and 7.7 kb from the 5′ end of intron 12 that represent the splicing sites used to produce the hip and hop cassette forms of the
rPACAPR, respectively, described by these workers. Introns interrupt
two of the three extracellular loops of the rPACAPR. Thus, the regions
of the rPACAPR gene coding for the intracellular domains of the
receptor are largely intronless compared with the regions coding for
the extracellular receptor domains. Also, four of the seven
transmembrane domains of the rPACAPR are encoded by individual
exons.
Genomic Southern Blot Analysis of the rPACAPR Gene
Southern
blot analysis of rat genomic DNA digested by EcoRI,
BamHI, BglII, HindIII, or
SacI was performed using a cDNA probe covering exons
3-11 of the rPACAPR gene (nucleotides 185-951 of the rPACAPR
cDNA) (Fig. 1). This region of the rPACAPR gene was mapped entirely
with each of these restriction enzymes, although Fig. 1 shows only the
EcoRI, BamHI, and HindIII sites for
purposes of clarity. Fig. 3 shows the results of the
Southern blot, and Table II shows the predicted and
actual sizes of the fragments recognized by the cDNA probe. As
shown, the observed patterns of hybridization were consistent with the
restriction sites we mapped in the rPACAPR gene coding sequence. These
results confirm the structural organization of the rPACAPR gene. They
indicate also that the rPACAPR is encoded by a single gene and that no pseudogenes are present in the rat genome.
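The comparison of predicted and observed fragment sizes can be illustrated with a short Python sketch that derives, from a restriction map, the fragments a probe should detect. The cut sites and probe (exon) intervals below are hypothetical values for illustration; they are not the mapped rPACAPR sites or the contents of Table II.

```python
def hybridizing_fragments(cut_sites, probe_intervals, region_length):
    """Return sizes (bp) of restriction fragments overlapping any
    probe-complementary interval of a linear mapped region.

    Fragments touching either end of the mapped region are bounded by
    the map edge rather than a real site, so their true genomic size
    would be larger than the value returned here.
    """
    bounds = [0] + sorted(cut_sites) + [region_length]
    hits = []
    for frag_start, frag_end in zip(bounds[:-1], bounds[1:]):
        overlaps = any(p_start < frag_end and p_end > frag_start
                       for p_start, p_end in probe_intervals)
        if overlaps:
            hits.append(frag_end - frag_start)
    return hits


if __name__ == "__main__":
    # Hypothetical map: three cut sites in a 40-kb region, and a cDNA
    # probe whose exons fall in two separate places in the gene.
    sites = [5_000, 12_000, 30_000]
    probe_exons = [(6_000, 6_100), (13_000, 13_200)]
    print(hybridizing_fragments(sites, probe_exons, 40_000))
```

Because a cDNA probe spans several exons separated by introns, a single-copy gene can legitimately give more than one band per digest; a single-copy interpretation rests on every observed band being accounted for by the map.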
Nearly 1.2 kb of the ATG 5′-flanking sequence
(GenBank accession number U82669) of the rPACAPR gene was
PCR amplified from restriction enzyme-digested, anchor-ligated genomic
DNA. Comparison of 5′-UTR sequences of rPACAPR cDNAs to this ATG
5′-flanking sequence of the rPACAPR gene revealed that the region
upstream of nucleotide −76 represents part of an intron (4 kb) that is
located between the 3′ splice acceptor site and exon 2 of the 5′-UTR of
the rPACAPR gene (see below).
To compare the
5′-flanking sequence of the rPACAPR gene to the 5′-UTR of the rPACAPR
cDNA, we performed 5′-RACE with rat spinal cord cDNA to amplify
the 5′ ends of the rPACAPR cDNA. Using this approach, we amplified
and cloned two cDNAs with 5′-UTR sequences beginning −81 and −102
nucleotides from the translation start site, which we named cDNA 1 and cDNA 2, respectively. As shown in Fig. 4,
cDNA 2 was identical to cDNA 1 in its 5′-UTR sequence up to
nucleotide −79, where it then diverged from cDNA 1 over the next
two bases and possessed an additional 21 bp of 5′ sequence.
Interestingly, both cDNA 1 and cDNA 2 are identical to the
genomic DNA sequence up to nucleotide −79, whereupon they both
diverge. In view of these findings, it is particularly noteworthy that there is a consensus splice acceptor sequence located at nucleotide −77
in the 5′-flanking sequence of the rPACAPR gene. The observed divergence in sequence of cDNA 1 and cDNA 2 from each other and from the genomic DNA sequence as well as the existence of a splice acceptor site in this region suggested alternative splicing of the
rPACAPR gene in the 5′ noncoding region.
We obtained further evidence for alternative splicing of the rPACAPR
gene in the 5′ noncoding region by amplifying two additional rPACAPR
cDNAs with unique 5′-UTR sequences. Because cDNA 1 diverged from cDNA 2 by only 2 bp at its 5′ end, we considered the
possibility that it represented a truncated form of cDNA 2 and that
the 2-bp divergence may have resulted from a PCR artifact, 5′ anchor
ligation artifact, or cloning artifact, although five individual clones exhibited the identical sequence of cDNA 1. To examine this
possibility, we performed PCR using the unique sequence at the 5′ end
of cDNA 2 as a forward primer, rather than the anchor primer, with
reverse primers to sequences present in the coding region of the
rPACAPR cDNA. Seminested PCR was performed using reverse primers to
nucleotides 292-325 and 230-262 of the rPACAPR cDNA in the first
and second rounds of PCR, respectively, with direct sequencing of the
PCR products. We used rat cerebral cortex and rat liver cDNA as
templates in these reactions. We gel-purified the major PCR product
amplified from cerebral cortex cDNA and the largest PCR product
amplified from rat liver cDNA. Unexpectedly, we amplified two
additional rPACAPR cDNAs with unique 5′-UTR sequences that we named
cDNA 3 (liver) and cDNA 4 (cerebral cortex).
Alignment of cDNAs 1-4, as shown in Fig. 4, reveals the existence
of a unique pattern of alternative splicing in the noncoding region of
the rPACAPR gene. As shown, cDNA 3 is identical to cDNA 4 except for the presence of an intervening sequence of 94 nucleotides. This intervening sequence begins precisely at the putative splice junction sequence (nucleotide −77) that we identified in the
5′-flanking sequence of the rPACAPR gene. This strongly suggests that
this sequence represents a 3′ splice acceptor site. None of the
cDNAs match the genomic sequence beyond nucleotide −79, further
supporting alternative splicing of the rPACAPR transcript to produce
mRNAs with unique 5′-UTRs.
We performed PCR using genomic DNA as template to identify the genomic
locations and intron-exon junction sequences encoding the 5′-UTRs of
cDNAs 2-4. As described above, we amplified 5 kb of 5′-flanking
sequence of the rPACAPR gene. Although we did not sequence this entire
segment of 5′-flanking DNA, we sequenced nearly 1.5 kb from each end,
and we were unable to find a noncoding exon matching the 23-nucleotide
sequence present at the 5′ ends of cDNAs 2-4. We hypothesized that
this 23-nucleotide sequence was upstream of this region, and we
performed PCR using the 23-nucleotide sequence as a forward primer with
a reverse primer to a sequence 3.5 kb upstream of the translation start
site (i.e. −3.5 kb). Using this approach, we amplified a
single product of 4.5 kb from rat genomic DNA, suggesting that the
23-nucleotide sequence in question is present in the genomic DNA
approximately 8 kb upstream of the translation start site. Sequence
analysis of this PCR-amplified genomic DNA revealed the presence of a
5′ splice donor sequence (aag) at the 3′ end of the
conserved 26-bp sequence found in cDNAs 2-4. We used a similar
approach to locate the sequence shared by cDNAs 3 and 4, which is
contiguous with the 3′ end of this conserved 26-bp sequence. We found
that this shared sequence is located approximately 3 kb downstream of
the 26-bp sequence in the genomic DNA, and we identified a splice
acceptor sequence (atg) at its 5′ end and a splice donor
sequence (aag) at its 3′ end. In addition, we identified
the unique sequence of cDNA 3 800 bp downstream of the shared
sequence in the genomic DNA with a splice acceptor sequence
(cat) at its 5′ end and a splice donor sequence
(aag) at its 3′ end. These findings indicate splicing of
intronic sequences to generate cDNAs 2-4.
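The distances quoted in this paragraph follow from simple addition and subtraction, which the Python sketch below makes explicit. It is an approximation under stated assumptions: primer lengths and the lengths of the noncoding sequences themselves are ignored, and the variable names are illustrative only.

```python
def forward_site_upstream_bp(reverse_primer_upstream_bp, product_size_bp):
    """Approximate distance of a forward primer site upstream of the
    translation start site, given where the reverse primer anneals and
    the size of the amplified product."""
    return reverse_primer_upstream_bp + product_size_bp


# Values taken from the text; all are approximate.
conserved_sequence = forward_site_upstream_bp(3_500, 4_500)  # ~8 kb upstream
shared_sequence = conserved_sequence - 3_000                 # ~3 kb further downstream
cdna3_unique_sequence = shared_sequence - 800                # ~800 bp further downstream

if __name__ == "__main__":
    for name, position in [
        ("conserved 5' sequence (cDNAs 2-4)", conserved_sequence),
        ("sequence shared by cDNAs 3 and 4", shared_sequence),
        ("sequence unique to cDNA 3", cdna3_unique_sequence),
    ]:
        print(f"{name}: about {position / 1000:.1f} kb upstream of the ATG")
```

The resulting positions, roughly 8, 5, and 4.2 kb upstream of the translation start site, place the sequence unique to cDNA 3 about 4 kb from the splice acceptor site at nucleotide −76.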
Therefore, our results suggest that there exist at least four exons
(exons 1-4) upstream of the consensus splice acceptor site located 5′
to nucleotide −76. Exon 1 is located approximately 3 kb upstream of
exon 2 and 8 kb upstream of the translation start site, while exon 3 is
located approximately 800 bp downstream of exon 2. The location of exon
4 in relation to these three exons, however, is not known. Based on
these findings, we proposed a scheme (Fig. 5) to account
for the generation of rPACAPR transcripts with unique 5′-UTRs.
According to this scheme, cDNA 3 is generated by splicing together
exons 1, 2, and 3 to the acceptor sequence at −76. Skipping exon 3
produces cDNA 4, while skipping both exon 2 and exon 3 generates
cDNA 2. Similarly, cDNA 1 results from splicing exon 4 with the
acceptor sequence (also see "Discussion").
Tissue-specific Generation of 5′-UTR Variants
Although cDNAs 2-4 have unique 5′-UTRs, they share
sequence derived from exon 1 (Fig. 5). This enabled us to examine
whether these unique PACAPR transcripts are expressed in a
tissue-specific fashion by analysis of products amplified from various
tissues by RT-PCR. We performed seminested PCR using a forward primer corresponding to sequences present in exon 1 with nested reverse primers corresponding to coding sequences of the rPACAPR cDNA. This
approach enabled us to amplify specifically rPACAPR transcripts retaining exon 1 at their 5′ ends. As shown in Fig. 6,
several products were amplified from most tissues examined. These
products ranged in size from approximately 350 to 575 bp, although
several larger products were present in kidney and no amplification
products were apparent in pancreatic β-islets. The three major
products present in most of the tissues examined, although in different relative amounts in different tissues, were gel-purified and sequenced. The sizes of these products corresponded exactly to the predicted products expected for cDNAs 2-4, and sequence analysis of these three products, labeled 574, 480, and 360 bp in Fig. 6, confirmed their identity to cDNAs 3, 4, and 2, respectively. Because the PCR conditions were not biased for amplifying
any one of these three cDNAs (i.e. each cDNA was
amplified with the same combination of primers), the relative amounts
of the PCR products corresponding to cDNAs 2-4 would be expected
to be proportional to their tissue abundance. Fig. 6 shows clear
differences in the PCR products corresponding to these three cDNAs.
For instance, the 574-bp product (cDNA 3) appears to be a major
transcript in liver and lung but not in brain. Although we have
amplified coding sequences of the rPACAPR from pancreatic β-islet
cells (not shown), the absence of products corresponding to cDNAs
2-4 in this tissue suggests that exon 1 is not retained in PACAPR
transcripts in these cells. Together, these results demonstrate
differential splicing of the rPACAPR transcript to produce rPACAPR
mRNAs with unique 5′-UTRs.
Fig. 6 also shows amplification of an additional 400-bp product in several tissues and of products larger than 574 bp in kidney. Whether these products represent additional splice variants of the rPACAPR that utilize an exon(s) not present in cDNAs 2-4 remains to be determined.
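The relationship between the exon-skipping scheme and the RT-PCR product sizes can be checked with a few lines of Python. The product sizes (574, 480, and 360 bp) and the exon composition of each cDNA are taken from the text; the exon lengths are inferred here by subtraction and are an assumption of this sketch, not separately reported values.

```python
# RT-PCR product sizes (bp) and exon composition, as given in the text.
PRODUCT_SIZES = {"cDNA 2": 360, "cDNA 4": 480, "cDNA 3": 574}
EXONS_INCLUDED = {"cDNA 2": (1,), "cDNA 4": (1, 2), "cDNA 3": (1, 2, 3)}


def inferred_exon_lengths(sizes):
    """Infer the lengths of 5'-UTR exons 2 and 3 from size differences
    between products that differ by exactly one exon."""
    return {
        2: sizes["cDNA 4"] - sizes["cDNA 2"],
        3: sizes["cDNA 3"] - sizes["cDNA 4"],
    }


def predicted_size(exons, exon1_only_size, exon_lengths):
    """Predicted product size: the exon 1-only product plus the length
    of every additional 5'-UTR exon that is retained."""
    return exon1_only_size + sum(exon_lengths[e] for e in exons if e != 1)


if __name__ == "__main__":
    lengths = inferred_exon_lengths(PRODUCT_SIZES)
    print(lengths)
    for name, exons in EXONS_INCLUDED.items():
        print(name, predicted_size(exons, PRODUCT_SIZES["cDNA 2"], lengths))
```

The 94-bp difference between the 574- and 480-bp products matches the 94-nucleotide intervening sequence that distinguishes cDNA 3 from cDNA 4, which supports assigning those bands to transcripts with and without exon 3.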
Identification of the Transcription Start Site
Fig. 7 shows transcription start site mapping by primer
extension on rat cerebral cortex and cerebellum poly(A)+
RNA. An antisense primer corresponding to nucleotides −1 to −36 of
the sequence present in cDNAs 1-4 (Fig. 4) was used in the primer
extension to reveal multiple products ranging from approximately 80 to
107 nucleotides. Because the extension primer we used began at
nucleotide −1, the lengths of the extension products correspond to the
position of the transcription start site relative to the translation
start site. The major extension product was 95-97 nucleotides in
length with minor extension products of 105-107 and 82 nucleotides in
length. These same bands were present in the cerebellum samples
(although not as apparent in this autoradiogram) but were not present
in the tRNA sample even upon overexposure of the autoradiogram. The
products less than 75 nucleotides in length were nonspecific, since
they were observed in the tRNA lane in other experiments.
The lengths of the 5′-UTR sequence of the two cDNAs we amplified by
5′ RACE (i.e. cDNA 1, 81 nucleotides; cDNA 2, 102 nucleotides) are in fairly good agreement with the extension products
of 82 and 105-107 nucleotides. Although Fig. 7 shows sequences
only up to approximately 190 nucleotides in length, no extension
products of up to 310 nucleotides in length were observed. Thus, no
extension products were found with lengths comparable with the lengths
of the 5′-UTR of cDNA 3 (212 bp) and cDNA 4 (301 bp). In view
of these findings and the fact that these two cDNAs were not
amplified by 5′ RACE, it is likely that their 5′ sequence is
incomplete. Since cDNA 2 shares the same 5′ sequence with cDNAs
3 and 4, it is possible that it too represents an incomplete transcript
(see "Discussion").
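The reasoning in this comparison can be sketched in Python: because the primer begins at nucleotide −1, an extension product of N nucleotides ends at position −N, and a cloned 5′-UTR is supported only if some extension product has a similar length. The product lengths and 5′-UTR lengths are those given in the text; the 5-nucleotide tolerance is an arbitrary assumption made for this illustration.

```python
OBSERVED_PRODUCTS_NT = [82, 95, 96, 97, 105, 106, 107]
UTR_LENGTHS_NT = {"cDNA 1": 81, "cDNA 2": 102, "cDNA 3": 212, "cDNA 4": 301}


def start_site_position(extension_length_nt):
    """Start-site position relative to the translation start site for a
    primer whose 5' end lies at nucleotide -1."""
    if extension_length_nt < 1:
        raise ValueError("extension length must be positive")
    return -extension_length_nt


def nearest_product(utr_length_nt, product_lengths_nt):
    """Extension product whose length is closest to a 5'-UTR length."""
    return min(product_lengths_nt, key=lambda n: abs(n - utr_length_nt))


def is_accounted_for(utr_length_nt, product_lengths_nt, tolerance_nt=5):
    """True if some extension product lies within the (assumed)
    tolerance of the 5'-UTR length."""
    nearest = nearest_product(utr_length_nt, product_lengths_nt)
    return abs(nearest - utr_length_nt) <= tolerance_nt


if __name__ == "__main__":
    for name, length in UTR_LENGTHS_NT.items():
        print(name, length, is_accounted_for(length, OBSERVED_PRODUCTS_NT))
```

Under this assumed tolerance, the 5′-UTRs of cDNAs 1 and 2 are matched by extension products, whereas those of cDNAs 3 and 4 are not, in line with the interpretation given above.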
In the present study, we have elucidated the structure of the
coding region of the rPACAPR gene, including its ATG 5′-flanking region, and provided evidence for tissue-specific differential splicing
of the rPACAPR mRNA in the 5′-UTR. The coding region of the rPACAPR
gene is composed of 15 exons and spans 40 kb of genomic DNA. The first
exon, encompassing the translation start site, extends into the 5′
noncoding region of the rPACAPR gene and together with exons 2-7
encodes the large N-terminal extracellular domain of the receptor.
Exons 7-14 encode the seven transmembrane domains and associated
intra- and extracellular loops, and exon 15 encodes the C-terminal
cytoplasmic tail of the receptor and extends into the 3′ noncoding
region of the rPACAPR gene.
The intron-exon organization of the rPACAPR gene is strikingly similar
to that of genes encoding other members of the group III family of G
protein-coupled receptors (i.e. the secretin/calcitonin/PTH receptor family). Elucidation of the gene structure of the porcine calcitonin receptor (pCTR) (39) in 1994 first revealed the unique intron-exon organization of a receptor in this group of G
protein-coupled receptors. By comparing the pCTR gene structure to
regions of the mouse growth hormone-releasing factor receptor (mGRFR)
reported to be coded for by different exons, these authors suggested a common ancestral origin of these two receptor genes. Recent studies (40, 41) elucidated the gene structures of two additional members of
this receptor family, the rat corticotropin-releasing factor receptor
(rCRFR) and mouse parathyroid hormone receptor (mPTHR). Fig. 8
shows the alignment of the rPACAPR exon-intron organization with that of these four receptors. As shown, there is a
remarkable conservation of intron-exon junctions in these five distinct
receptors from three different species. Exons encoding the
transmembrane domains and intracellular regions of these receptors (i.e. following intron 6) have an intron-exon organization
that is extremely well conserved among these receptors, whereas some variability in gene structure is apparent in exons encoding the extracellular N-terminal region. This latter finding may reflect evolution of a primordial gene to produce receptors with different ligand specificities in view of evidence indicating that the N-terminal extracellular region of several members of the group III family of G
protein-coupled receptors, i.e. the glucagon, secretin, VIP, and PTH receptors, represents the ligand binding domain of these receptors (42-44).
Genes encoding G protein-coupled receptors have been found to have
either no introns (e.g. α2- and β-adrenergic) (45, 46), introns only in noncoding regions
(e.g. platelet-activating factor, bradykinin B2) (47, 48),
introns only in coding regions (e.g. substance K, substance
P) (49, 50), or introns in both coding and noncoding regions
(neuropeptide Y Y1, endothelin A) (51, 52). However, apart from the
luteinizing hormone receptor gene, which has 10 introns all located
within the N-terminal extracellular domain, six or fewer introns are
present in the coding regions of the G protein-coupled receptor genes
described to date (49, 52). Moreover, none of these genes have the
characteristic interruption of transmembrane domains IV, V, and VII by
introns as found in the rPACAPR, pCTR, mGRFR, rCRFR, and mPTHR genes.
Thus, the complex intron-exon organization of the rPACAPR, pCTR, mGRFR,
rCRFR, and mPTHR genes appears to be unique to the group III family of
G protein-coupled receptors.
We also provide the first evidence for alternative splicing of the
PACAPR mRNA in its 5′-UTR. Using a combination of 5′-RACE and
RT-PCR, we identified four rPACAPR cDNAs with unique 5′-UTR sequences. All four cDNAs are identical to each other and to the 5′
genomic sequence up to nucleotide −76, where we identified a splice
acceptor sequence 5′ to this nucleotide. Three of the four cDNAs
differ from the shortest cDNA (cDNA 1) by the presence of a
conserved 23-bp sequence, located at different upstream positions for
each cDNA. Our results show that cDNAs 2-4 are formed by
alternative splicing of three exons, located approximately 8 kb (exon
1), 5 kb (exon 2), and 4.2 kb (exon 3) upstream of the translation start site, to the splice acceptor site located 5′
to nucleotide −76 (Fig. 5). One of these cDNAs (cDNA 3) is formed by splicing together of exons 1, 2, and 3 to the acceptor sequence, while the other
two cDNAs are produced by skipping of one (cDNA 4) or two of
these exons (cDNA 2). In contrast to the sharing of exons by
cDNAs 2-4, cDNA 1 is produced by splicing to an exon
containing minimally GGCAG. These findings demonstrate that splicing
occurs at several locations in the 5′-UTR of the rPACAPR mRNA to
generate transcripts with unique 5′-UTR sequences.
In view of these findings, it was of great interest to compare the
sequence of cDNAs 1-4 and the 5′-flanking sequence of the rPACAPR
gene described here to the 5′-UTRs included in reports describing the
cloning of the rPACAPR cDNA. The five groups that included
sequences from the 5′-UTR region of cloned rPACAPR cDNAs reported
five different lengths of this sequence, i.e. 30, 76, 90, 335, and 396 nucleotides (26-30). Each of these sequences matches the
genomic sequence described here up to nucleotide −76. The three
reported sequences that extend beyond this location all diverge at this
point (28-30) but match exactly the sequence of cDNA 2 reported
here. cDNA 2 is the cDNA in which exon 1 is spliced directly to
the 3′ acceptor sequence (Fig. 5), and our results suggest that this
exon is located 8 kb upstream of the translation start site. These
findings confirm that the 3′ splice acceptor site we identified here
does, in fact, represent a splicing site and that splicing of the 5′
noncoding region of the rPACAPR gene to produce the 5′-UTR of cDNA
2 occurs in at least two other tissues (colliculi, olfactory bulb)
(28-30). The finding that two of these sequences extend beyond the 5′
end of cDNA 2 and match each other beyond this point is consistent
with our suggestion, based upon primer extension data, that cDNAs
2-4 are incomplete at their 5′ ends.
Our finding that alternative splicing of the rPACAPR mRNA can
produce rPACAPR transcripts with heterogeneous 5′-UTR sequences prompted us to examine the tissue-specific nature of this phenomenon. RT-PCR analysis demonstrated differential splicing of the rPACAPR transcript to produce rPACAPR mRNAs with heterogeneous 5′-UTRs in
nearly every tissue we examined. Three major products, shown by
sequence analysis to represent cDNAs 2-4, were amplified from every tissue except pancreatic β-islets. Thus, splicing in the 5′-UTR of the
rPACAPR mRNA according to the scheme shown in Fig. 5 represents the
molecular basis for formation of these heterogeneous rPACAPR transcripts. The relative amounts of these three rPACAPR transcripts, PCR-amplified using the same primer combinations, varied in different tissues, indicating differential expression of these three rPACAPR transcripts in different tissues. Further experiments will be required
to determine whether the 440-bp product amplified in several tissues or
the additional products amplified in kidney arise from differential
splicing of exon 1 to additional exons in the 5′-UTR of the rPACAPR
transcript.
At present it is unclear whether splicing of the rPACAPR mRNA in
the 5′-UTR represents an important regulatory mechanism involved in
tissue- or cell-specific expression, mRNA stability, or mRNA translatability as found in other genes (53-56). Alternative splicing of 5′
noncoding regions of genes for other G protein-coupled receptors, including human neuropeptide Y Y1 and A1 adenosine receptors (52, 57),
seems to be related to the tissue-specific expression of their
transcripts. These findings are consistent with the tissue-specific expression of differentially spliced rPACAPR transcripts observed in
the present study. Presently, it is unknown whether transcripts with
alternate 5′-UTR sequences are a common feature among group III G
protein-coupled receptors, although there is evidence for alternative
splicing of the pCTR in its coding region (39). Spengler et
al. (30) demonstrated the existence of five rPACAPR cDNAs,
representing splice variants of the receptor in the third intracellular
loop region, a region shown here to be encoded by exon 12, where we
identified the location of these splicing sites. No dramatic
differences in the signaling activities of these coding region splice
variants were observed, since all receptors stimulated adenylyl cyclase
and all but one also stimulated phospholipase C. However, these
findings together with the present study show that the rPACAPR gene can
be alternatively spliced in both 5′ noncoding regions and coding
regions.
Primer extension analysis demonstrated multiple transcription start sites
located from 80 to 107 bp upstream of the translation start site, with
no other sites detected up to 310 bp from this site. These sites could
represent heterogeneity in the transcription start site for cDNA 1 and/or the possible existence of additional transcripts, but they are
insufficient in length to account for cDNAs 2-4. Indeed, these
results suggest that cDNAs 2-4 represent incomplete transcripts,
because no extension products corresponding to their sizes were
observed. We sequenced the 5′-flanking region to 1164 bp upstream of
the translation start site. This region does not contain a splicing
site that can account for the generation of cDNA 1. Further studies
will be required to identify the 5′-flanking regions of cDNA 1 and
cDNA 4, although we do know that exon 1 is located 8 kb upstream of
the translation start site.
In summary, we used a novel PCR-based strategy to amplify the entire
coding region of the rPACAPR gene, determined its intron-exon organization, and provided evidence for differential splicing in the
5′-UTR of the rPACAPR transcript. Southern analysis indicated a single
copy of this gene. The elucidation of the rPACAPR gene structure
establishes the foundation for the use of molecular genetic approaches
to further study the regulation of its transcription and splicing in
the 5′ region and the role of this gene in physiology, particularly in
processes such as hormone secretion and neurotransmission.
The nucleotide sequences reported in this paper have been submitted to the GenBank/EMBL Data Bank with accession numbers U82669, U84740, U84741, U84742, and U84743.