(Received for publication, September 8, 1994; and in revised form, December 2, 1994)
The cDNA sequences encoding the central cannabinoid receptor, CB1, are known for two species, rat and human. However, little information concerning the flanking, noncoding regions is presently available. We have isolated two overlapping clones from a human lung cDNA library with CB1 cDNA inserts. One of these, cann7, contains a short stretch of the CB1 coding region and 4 kilobase pairs (kb) of the 3`-untranslated region (UTR), including two polyadenylation signals. The other, cann6, is identical to cann7 upstream from the first polyadenylation signal, and in addition, it contains the whole coding region and extends for 1.8 kb into the 5`-UTR. Comparison of cann6 with the published sequence (Gérard, C. M., Mollereau, C., Vassart, G., and Parmentier, M. (1991) Biochem. J. 279, 129-134) shows the coding regions to be identical, but reveals important differences in the flanking regions. Notably, the cann6 sequence appears to be that of an immature transcript, containing 1.8 kb of an intronic sequence in the 5`-UTR.
In addition, polymerase
chain reaction amplification of the CB1 coding region in the IM-9 cell
line cDNA resulted in two fragments, one containing the whole CB1
coding region and the second lacking a 167-base pair intron within the
sequence encoding the amino-terminal tail of the receptor. This
alternatively spliced form would translate to an
NH2-terminal modified isoform (CB1A) of the receptor,
shorter than CB1 by 61 amino acids. In addition, the first 28 amino
acids of the putative truncated receptor are completely different from
those of CB1, containing more hydrophobic residues. Rat CB1 mRNA is
similarly alternatively spliced. A study of the distribution of the
human CB1 and CB1A mRNAs by reverse transcription-polymerase chain
reaction analysis showed the presence of both CB1 and CB1A throughout
the brain and in all the peripheral tissues examined, with CB1A being
present in amounts of up to 20% of CB1.
The pharmacological activities of cannabis, its most active
principle, Δ9-tetrahydrocannabinol, an endogenous
agonist, anandamide (1), and numerous synthetic agonists are
mediated by the cannabinoid receptor. Partial cDNA sequences encoding
the receptor have been isolated from rat (2) and human (3) brain tissues. The deduced protein sequence of the human
receptor shows that it comprises 472 amino acids, with the seven highly
hydrophobic regions typical of the G protein-coupled receptor
superfamily. The rat receptor shares 97.3% identity with the human
receptor. This central form of the cannabinoid receptor, CB1, mediates
both the inhibition of adenylyl cyclase via a pertussis toxin-sensitive
GTP-binding regulatory protein (4) and the inhibition of
N-type calcium channels (5). Although CB1 and its mRNA are
predominantly found in the brain (2, 6, 7, 8, 9), the
mRNA has also been detected in peripheral tissues (3, 10) by Northern analysis and by
PCR (11). PCR fragments within the coding region
of CB1 cDNA from the human testis (3) and from the myeloid cell
line U937 (11) were identical in sequence to the brain-derived
CB1 cDNA, suggesting that the peripheral and brain sequences were
identical. However, the cannabinoid receptor story became more
complicated with the subsequent cloning from the human promyelocytic
leukemic cell line HL-60 and from macrophages in the marginal zone of
human spleen of a distinct, exclusively peripheral receptor, designated
CB2 (12). CB2, although exhibiting only 51% identity to CB1 in
the transmembrane regions, strongly resembles the latter in its
interactions with various ligands and also mediates the inhibition of
adenylyl cyclase (13). Many questions arise from this apparent
redundancy, among them the exact cellular localization and regulation
of the two receptors and of their transcripts. To start answering these
questions, we have undertaken the cloning of the full-length human CB1
cDNA. In the course of this investigation, we discovered the presence
of two introns in the CB1 gene, one in the 5`-UTR and the second in the
coding region of the receptor. Here we show that translation of the
fully matured coding region would lead to a truncated and
amino-terminal modified form of CB1, designated CB1A. Interestingly,
during expression studies of rat CB1 cDNA in a baculovirus expression
system, Pettit et al. (14) recently detected short
variants of CB1, which they attributed to protein degradation, but
which were more probably translation products of the CB1A mRNA.
Finally, using PCR we have established the relative distribution of the
CB1 and CB1A mRNAs in central and peripheral tissues and show that both
mRNAs are to be found in the majority of these tissues.
Figure 2: The structure of the 5`-extremity of the
coding region of the human cannabinoid receptor CB1 cDNA. a,
schematic representation of the 5`-extremity of the coding region of
CB1. CB1 is initiated at ATG1 from an mRNA that is unspliced in the
coding region. CB1A results from a 167-nucleotide excision between
donor (D) and acceptor (A), using ATG2 as the
initiation codon. TM1 codes for the first transmembrane
region. The striped box represents sequences common
to CB1 and CB1A, between acceptor (A) and the end of the
coding region (represented by the asterisk). b, open
reading frames of the two human cannabinoid receptor isoform cDNAs and
their deduced amino-terminal sequences. The nucleotide sequences in boldface indicate the consensus donor and acceptor sequences,
with the splice junctions indicated by arrows. The singly underlined sequences are the amino-terminal sense and
antisense PCR primer regions, and the doubly underlined sequence is that of the NH2-terminal probe. The asterisks indicate in-phase stop codons, with the sequence in parentheses being that of cann6/BS08. The amino acids common
to the two sequences are in boldface. TMI is the
first transmembrane region. The glycosable asparagines are indicated by
.
Figure 1: The human central cannabinoid receptor CB1 cDNA. a, schematic structure of clones cann6, cann7, and BS08 (3). The shaded boxes represent the coding region; the arrows represent the polyadenylation sites; and the bent lines in BS08 are divergent sequences, discussed in the text. b, partial sequence of clone cann6. The CB1 coding region (shaded area) is identical for cann6 and BS08 (3). Nucleotides common to both cann6 and BS08 are in upper-case letters; divergences are in lower-case letters for cann6 and in italic letters for BS08. Only the last 28 nucleotides of the 5`-intron in cann6 are given to show the splice acceptor sequence (doubly underlined). The totality of the 3`-UTR of cann6 is shown, with the canonical polyadenylation signal boxed. A possible splice donor sequence, which may account for the difference with BS08 downstream from nucleotide 1546, is singly underlined.
The 3`-extremity of cann6
overlapped with the 5`-extremity of cann7 over 0.95 kb, the sequences
being identical in this region. In contrast to cann7, cann6 contained
only a 0.6-kb 3`-UTR, ending shortly after a polyadenylation signal (Fig. 1b), the first to be found in cann7, but
contained the entire coding region of CB1 plus an upstream region of
1.82 kb. The coding region of cann6 was identical to that
of the original BS08 clone from a human brain stem library (3).
Taken together, the 1.5 kb of the BS08 clone plus the 4-kb 3`-UTR
identified in cann7 corresponds well with the 6-kb mRNA identified for
CB1 by Northern analysis of RNA from dog, rat, and guinea pig
tissues (2, 3). Matsuda et al. (2) reported the existence of two alternatively
polyadenylated rat CB1 mRNAs with 3`-UTRs 4.1 and 1.2 kb in
length; the sequences have not been published, but the general
structures of the rat and human sequences are clearly similar.
Two significant differences were seen between the extremities of BS08 and
the cDNA sequences described here (Fig. 1). The cann6 and cann7
sequences diverge from the BS08 sequence 123 nucleotides after the stop
codon, just after what appears to be a splice donor consensus sequence (singly underlined in Fig. 1b). Since
the 64 nucleotides remaining in BS08 have not yet been found farther
downstream in the cann7 sequence, these nucleotides may be
at the start of a nonmatured intron. More interestingly, cann6 diverges
from BS08 63 nucleotides upstream from the initiation codon.
Examination of the cann6 sequence at this point reveals the presence of
a splice acceptor site (doubly underlined in Fig. 1b). The 1.76 kb at the 5`-extremity of the cann6
sequence would appear to be part of a nonexcised intron, only the
3`-extremity of which is shown in lower-case letters in Fig. 1b. Therefore, the cann6 clone most probably
results from an immature transcript. The sequence of the BS08 clone
upstream from the splice site (in italic letters in Fig. 1b) is not to be found anywhere in the cann6
sequence.
The 3`-extremity of this part of the BS08 clone,
AAGG, could be part of a splice donor site. The full length of the
5`-intron in the human DNA is not known, but is currently being
investigated in our laboratory. Interestingly, the rat CB1 sequence (2) is very similar to the human sequence starting 20
nucleotides downstream from the putative splice acceptor site, but
diverges markedly upstream from this point (not shown), perhaps
indicating that the rat gene also contains an intron in its 5`-flanking
region.
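The sizes quoted for the 5`-region of cann6 can be cross-checked with a line of arithmetic. The sketch below is only a consistency check, assuming that the rounded 1.82-kb upstream region corresponds to 1820 nucleotides; the variable names are illustrative and not taken from the original work.

```python
# Sizes quoted in the text for clone cann6 (1.82 kb is a rounded value,
# so 1820 nt is an assumption made for this check).
UPSTREAM_NT = 1820            # sequence upstream of the initiation codon
DIVERGENCE_FROM_ATG_NT = 63   # cann6 diverges from BS08 63 nt upstream of the ATG

# Whatever lies upstream of the divergence point is the putative intronic part.
intron_part_nt = UPSTREAM_NT - DIVERGENCE_FROM_ATG_NT

print(f"putative intronic sequence in cann6: {intron_part_nt / 1000:.2f} kb")
```

Under this assumption the result agrees with the 1.76 kb attributed to the nonexcised intron.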
Examination of this region of the CB1 sequence revealed the
presence of consensus splice donor and acceptor sequences in the
precursor mRNA (D and A in Fig. 2, a and b). The shorter, deleted sequence we found
corresponded to a spliced form of the receptor mRNA. In this spliced
form, the translational start codon, AUG1, the first to be encountered
after an upstream stop codon in the cannabinoid receptor gene (CB1 in Fig. 2b), is no longer in frame with the rest of the
coding region, and its use would result in a peptide of only 35 amino
acids. A putative initiation codon, AUG2, for the foreshortened form of
the receptor is a little further downstream, immediately preceded by a
stop codon. Initiation from AUG2 could occur only if AUG1 were a
"leaky" codon according to the Kozak scanning model of
translation initiation (20), thereby allowing the
scanning ribosomal complex to initiate from the next downstream AUG triplet.
In fact, AUG1, although preceded by G at position -3, is followed
by A at position +4, giving a leaky initiation site, for which
there are numerous precedents (21, 22). AUG2 has G at
position +4, but T at position -3, again constituting only a
moderately favorable context for translation (23). If AUG2
initiates translation of this shorter receptor, it would involve a
single frameshift from the reading frame of CB1. The deduced form of
the receptor would be shorter than CB1 by 61 amino acids, and in
addition, the 28 amino acids at its NH2-terminal extremity
would be quite different from those of the longer form, being, in
particular, hydrophobic and relatively rich in Pro.
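The reading-frame argument above can be illustrated with a short sketch. It uses only the numbers given in the text (a 167-nucleotide excision, a 472-residue CB1, a 61-residue shortening) and a common simplification of the Kozak consensus in which a purine at -3 and a G at +4 (the base immediately after the AUG, counting the A of AUG as +1) each contribute to a favorable context. The function names and the three-level classification are assumptions made for illustration; they are not the authors' analysis.

```python
PURINES = {"A", "G"}

def frame_shift(deleted_nt: int) -> int:
    """Reading-frame offset (0, 1 or 2) introduced by removing deleted_nt bases."""
    return deleted_nt % 3

def kozak_context(minus3: str, plus4: str) -> str:
    """Rough context strength from the bases at -3 and +4 (A of AUG is +1).

    Simplified rule (an assumption): purine at -3 and G at +4 -> strong,
    only one of the two -> adequate, neither -> weak.
    """
    hits = int(minus3.upper() in PURINES) + int(plus4.upper() == "G")
    return ("weak", "adequate", "strong")[hits]

# Values taken from the text.
print(frame_shift(167))         # 167 = 3 * 55 + 2, so AUG1 falls out of frame
print(472 - 61)                 # deduced length of CB1A in amino acids
print(kozak_context("G", "A"))  # AUG1: G at -3, A after the AUG
print(kozak_context("T", "G"))  # AUG2: T at -3, G at +4
```

With this rule both start codons fall in the intermediate class, which matches the description of AUG1 as leaky and of AUG2 as only moderately favorable.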
As a result of
the excision, the putative isoform, designated CB1A, lacks two of the
three potential Asn-linked glycosylation sites present in CB1. After
the splice junction, the reading frame of CB1 is restored, and the
remaining amino acids of CB1 and CB1A are identical, including the
remaining 27 amino acids of the NH-terminal region (Fig. 2b and Fig. 6). Experimental evidence that
CB1A protein indeed exists has very recently come from the expression
of the rat CB1 cDNA from a baculovirus expression vector in Sf9
cells (14). Western immunoblotting and 35S metabolic
labeling revealed the presence of a predominant species with a relative
molecular mass of 55 kDa, corresponding to CB1 (calculated molecular
mass of 53 kDa), together with cannabinoid-specific bands at 45 and 50
kDa, which could well correspond to non-glycosylated CB1A (calculated
molecular mass of 46 kDa) and to a glycosylated form, respectively.
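As a rough plausibility check on these calculated masses, the sketch below applies the common rule of thumb of about 110 Da per amino acid residue. That average is an assumption, not a value from this paper, and the 472-residue human length is used as a stand-in for the rat protein, which differs slightly in length (Fig. 6 notes a Gly insert). Exact masses depend on the actual sequence.

```python
AVG_RESIDUE_KDA = 0.110  # assumption: ~110 Da average mass per residue

def approx_mass_kda(n_residues: int) -> float:
    """Approximate polypeptide mass in kDa from its length alone."""
    return n_residues * AVG_RESIDUE_KDA

CB1_AA = 472            # length of human CB1 given in the text
CB1A_AA = CB1_AA - 61   # CB1A is shorter than CB1 by 61 amino acids

print(f"CB1:  ~{approx_mass_kda(CB1_AA):.0f} kDa")
print(f"CB1A: ~{approx_mass_kda(CB1A_AA):.0f} kDa")
```

The estimates land within about 1 kDa of the calculated values of 53 and 46 kDa quoted above, so the difference between the two isoforms is of the expected size for a 61-residue truncation.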
Figure 6: The deduced NH2-terminal extremities of CB1 and CB1A of the
human and rat cannabinoid receptors. The human sequences are inside the circles; the differences found in the rat receptor
are shown at the relevant positions outside the circles. The Gly insert in the rat sequence is shown by the arrow. The amino acids common to the two forms are lightly shaded, and the glycosable asparagines are darkly shaded. TM1 is the first transmembrane region
beneath the thick line representing the membrane
surface.
Figure 3: Tissue distribution of CB1 and CB1A
transcripts determined by PCR. The upper panels are
ethidium bromide-stained 2% agarose gels; the lower panels are the respective Southern blots with the bands revealed using
the NH2-terminal probe, followed by chemiluminescence. a: lane 1, size markers; lane 2,
heart; lane 3, colon; lane 4, stomach; lane 5, liver; lane 6,
pancreas; lane 7, placenta; lane 8, lung; lane 9, kidney. b: lane 1, brain stem; lane 2, cortex +
cerebellum; lane 3, inferior hemisphere; lane 4, temporal lobe; lane 5, size markers; lane 6, IM-9; lane 7, U937; lane 8, U373; lane 9, peripheral
blood leukocytes; lane 10, genomic DNA.
Figure 4: Distribution of CB1 and CB1A transcripts
in human and rat brain determined by PCR. a and b (upper panel), ethidium bromide-stained 2%
agarose gel of amplicons resulting from PCR with
NH2-terminal sense and antisense primers of human cDNA
libraries. a: lane 1, size markers; lane 2, brain stem; lane 3, frontal cortex; lane 4, cerebellum; lane 5,
hippocampus; lane 6, occipital cortex; lane 7, striatum; lane 8, substantia nigra; lane 9, temporal cortex. b: lane 1, rat astrocytes; lane 2, rat
hippocampus; lane 3, size markers. The lower panel in b is a Southern blot of the upper panel revealed using the NH2-terminal probe
detected by chemiluminescence.
Figure 5: The structure of the 5`-extremity of the coding region of the rat cannabinoid receptor CB1 cDNA. See legend to Fig. 2b for details. The 5`-UTR sequence in parentheses is identical in both rat and human.
A distinguishing feature of the central cannabinoid receptor is its long extracellular extremity. The discovery that CB1 mRNA exists in two forms differing in the amino-terminal coding region not only poses the question as to the existence and function of two CB1 isoforms, but also opens up an opportunity to assess the importance of this region of the receptor for its interaction with its ligands. In addition to the small coding region intron, a large intron of at least 1.8 kb in length is to be found just upstream of the initiation codon in the human gene (Fig. 1) and, possibly, in the genes of other species. This presents the possibility of further alternative splicing. In view of the potential of cannabinomimetics for the treatment of a variety of ailments, to ensure drug specificity, it is clearly important to ascertain whether other subtypes or isoforms of the receptor exist.
Most of the receptors for small nonpeptidic
molecules have small amino-terminal regions and ligand-binding sites
constituted by pockets formed by the heptahelical transmembrane
regions (25, 26). The metabotropic glutamate receptor
is unique in that its unusually large extracellular extremity has been
shown to be directly implicated in small molecule
selectivity (27). All the known cannabinoids, including an
endogenous ligand, anandamide (1), are small, nonpeptidic
molecules. We have evidence that the truncated and modified isoform
CB1A exhibits somewhat different ligand binding properties than CB1
when expressed in Chinese hamster ovary cells. At the
present time, we do not know if the amino-terminal region plays any
structural or functional role. As a result of the splicing, CB1A lacks
two of the three potential glycosylation sites present in CB1. We
cannot assess the importance of this observation at present. However,
Howlett et al. (28) reported that treatment of N18TG2
neuroblastoma cells with tunicamycin failed to alter either
agonist-induced inhibition of cAMP accumulation or desensitization
processes in the presence of cannabinoids, suggesting that
glycosylation is unimportant for activity, but the authors cautioned
that the rate of receptor synthesis and degradation had not been taken
into consideration. Unfortunately, no agonist binding data to correlate
activation with signal transduction were presented. This point remains
to be clarified, as it has been shown that glycosylation can be
important either for signal transduction, as shown in
rhodopsin (29), or for directing subcellular distribution, as
observed for the hamster β-adrenergic receptor (30).
We recently reported the presence of CB1 mRNA
in leukocytes using reverse transcription-PCR (11). Here, using
the same technique, we show the ubiquitous distribution of both CB1 and
CB1A mRNAs in relative amounts covering a wide range of values (Table 1). CB1A mRNA exists as a minor transcript since at its
highest level it represents only 20% of the total central
cannabinoid receptor mRNA content. In most of the peripheral tissues,
CB1A mRNA was consistently found at 10% of CB1 levels, falling to
<1% in the kidney. The relative levels in the brain were more
variable. CB1A mRNA was almost totally absent in the brain stem and
temporal lobe of the human infant (Table 1), but was found at a
higher level in a cortex-cerebellum fraction prepared from the same
donor. In contrast, in a human brain stem cDNA library derived from a
2-year-old, CB1A mRNA was present at 12% of CB1 mRNA levels. The
overall age-related levels of the central cannabinoid receptor and its
mRNA have been studied in rat
brain (31, 32, 33). In at least some brain
regions, CB1 levels rise just after birth to attain adult levels,
thereafter decreasing, the decrease perhaps being attributable to a
lower rate of CB1 transcription and/or to messenger
instability (33); within these global fluctuations, we do not
know if the CB1/CB1A ratio also varies. It is perhaps relevant that CB1
mRNA exists in at least two 3`-UTR variants, of 0.6 and 4 kb, which may
contain regulatory signals important for mRNA stability and
for the control of the translation process (see Ref. 34 for a
recent review). It follows that a detailed knowledge of the age-related
brain distribution and control of these 3`-UTR variants and of the
amino-terminal isoforms would help toward our understanding of this
receptor.
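The relative levels discussed in this section are expressed in two ways, as a percentage of CB1 mRNA and as a percentage of the total central cannabinoid receptor mRNA, and the two conventions are not interchangeable. The sketch below shows the conversion, assuming that CB1 and CB1A are the only transcripts contributing to the total; the function name is hypothetical.

```python
def fraction_of_total(ratio_to_cb1: float) -> float:
    """Convert a CB1A/CB1 ratio into CB1A as a fraction of (CB1 + CB1A)."""
    return ratio_to_cb1 / (1.0 + ratio_to_cb1)

# Ratios quoted in the text, expressed relative to CB1.
for ratio in (0.20, 0.12, 0.10):
    share = fraction_of_total(ratio)
    print(f"CB1A at {ratio:.0%} of CB1 = {share:.1%} of total")
```

The difference between the two conventions is small at these levels, but it grows with the ratio, so it is worth stating which one is meant when comparing tissues.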
The existence of two subtypes of the cannabinoid receptor, CB1 and CB2, the first of which exists as two isoforms, will facilitate the discovery and development of cannabinoids having high selectivity and possibly therapeutic potential. One such molecule is the recently described antagonist SR 141716A (13), which has high affinity and specificity for the central forms of the receptor. In return, these specific ligands will be invaluable tools for facilitating a detailed investigation of the structural features of the various cannabinoid receptors and their physiological significance.
The nucleotide sequences reported in this paper have been submitted to the GenBank(TM)/EMBL Data Bank with accession numbers X81120, X81121, and X81122.