(Received for publication, October 10, 1995; and in revised form, December 18, 1995)
The seven-transmembrane segment thrombin receptor (TR) represents the prototype of a putative family of proteolytically cleaved receptors that may include the proteinase activated receptor-2. A panel of somatic cell hybrids retaining distinct portions of human chromosome 5 was used to establish that the human TR gene is present as a single-copy locus within the region 5q11.2-q13.3, confirming our previous localization using fluorescent in situ hybridization analysis. To further characterize the TR gene, overlapping clones from a human genomic library were isolated. Genomic analysis confirmed that the TR gene is of limited complexity, spanning approximately 27 kilobases and containing two exons separated by a large, approximately 22-kilobase intron. The larger second exon contains the majority of the coding sequence and the thrombin cleavage site, remarkably similar to the organization of the proteinase activated receptor-2 gene, in which the putative cleavage site is also contained within the large second exon. Primer extension analysis using two 30-mer oligonucleotide primers known to be contained within the first exon identified the predominant transcription initiation site 351 base pairs upstream from the initiator methionine in both human umbilical vein endothelial and human erythroleukemia cells. Sequence analysis of the 5`-flanking region revealed the TR promoter to be TATA-less, although nucleic acid motifs potentially involved in transcriptional gene regulation were evident, including a GATA motif, octamer enhancer sequences, AP-2-like sites, and Sp1 sites. These data provide evidence for remarkable similarity at the gene level between the two proteolytically cleaved receptors described to date.
The serine protease α-thrombin plays a critical role in hemostasis and thrombosis via interactions with specific coagulation proteins and cells diversely involved in regulatory functions of the vessel wall. α-Thrombin is among the most potent of the physiological stimuli for platelet aggregation (1), modulates the endothelial cell hemostatic response (2, 3, 4), and is mitogenic for vascular smooth muscle cells (5) and fibroblasts (6). A G-protein-coupled thrombin receptor (TR) structurally similar to other members of the seven-transmembrane segment receptor family (7) has been isolated from a megakaryocytic (Dami) cell line (8). The cDNAs for similar receptors have been identified and cloned from human endothelial cells (9), CCL39 hamster lung fibroblasts (10), and rat vascular smooth muscle cells (11). Activation of the receptor by α-thrombin and/or synthetic ligands representing the new N terminus after thrombin cleavage (12, 13, 14) results in dual coupling to phospholipase C and adenylyl cyclase (15). Molecular mechanisms of thrombin receptor activation have been studied by this and other laboratories, and these results suggest that critical structural determinants regulating receptor activation exist within the long extracellular domain and the second extracellular loop (16, 17).
Despite the extensive and rapid accumulation of data directed toward elucidation of cellular activation mechanisms mediated by this receptor, little is known about the molecular genetics of the thrombin receptor. The concept of an extended gene family has recently been underscored with the isolation and cloning of a second proteinase-activated receptor (PAR-2) (18). Like the thrombin receptor, PAR-2 is activated by proteolytic cleavage and by synthetic peptides corresponding to the new N terminus after cleavage. Whereas trypsin unequivocally activates this receptor, the presence of additional physiological enzyme agonist(s) remains unproven although probable (18). We have now completed the molecular characterization of the human thrombin receptor gene and provide further evidence for remarkable similarity at the gene level between the human thrombin receptor and PAR-2 genes. These data provide conceptual support for the presence of a more extended gene family of proteolytically cleaved receptors that may have evolved from a common primordial gene.
A human genomic library cloned into the bacteriophage λ EMBL3 was kindly supplied by Dr. W. Schubach (SUNY at Stony Brook). Library screening was completed with the ³²P-radiolabeled TR cDNA insert essentially as described previously (20), utilizing Escherichia coli host strain NM539. Positive phage clones were plaque-purified, and the DNA was purified from minilysates by standard methods (21). Alternatively, P1 genomic clones were obtained by PCR using oligonucleotide primers spanning the second exon (Genome Systems, Inc., St. Louis, MO). Genomic fragments were extensively characterized by end-ordered partial digestion and Southern blot analysis, and individual fragments were subcloned into pBluescript (Stratagene, La Jolla, CA) or M13mp18 (Sigma) for sequence analysis using dideoxy chain termination (22). Exon-intron boundaries were defined by comparison of genomic DNA sequence with that of the published cDNA (8, 9). Sequence analysis was performed using the Wisconsin Genetics Computer Group Package (23).
Figure 1: Genomic analysis using human:hamster somatic cell hybrids. Approximately 10 µg of high molecular weight DNA from HHW105 (containing a single human chromosome 5 as its only human component), HHW213 (containing a single human chromosome 5 lacking 95% of the long arm of chromosome 5 with an intact 5p), HHW1064 (containing a single human chromosome 5 with a deletion within the region 5q11.2-5q13.3), or total human genomic DNA was restricted with EcoRI, size-fractionated, and evaluated by Southern blot analysis using the radiolabeled TR cDNA as probe. A single hybridizing fragment is evident only with DNA from HHW105 and total genomic DNA, confirming that the TR is present as a single-copy gene within 5q11.2-5q13.3. The relative positions of HindIII-digested λ phage DNA fragments used as size markers are indicated.
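The mapping logic behind Figure 1 can be written out as a small set calculation. The sketch below is illustrative only: the division of chromosome 5 into four coarse segments is an assumption made for the example, as is the choice of which part of 5q is retained in HHW213 (the caption states only that 95% of the long arm is missing). The hybridization results are those given in the caption.

```python
# Coarse segments of human chromosome 5 (hypothetical partition for illustration).
SEGMENTS = ["5p", "5cen-q11.2", "5q11.2-q13.3", "5q13.3-qter"]

# Which segments each hybrid retains, and whether the TR probe hybridized.
HYBRIDS = {
    # Intact chromosome 5: signal present.
    "HHW105": {"retained": set(SEGMENTS), "signal": True},
    # Intact 5p, most of 5q missing (retained portion of 5q is assumed): no signal.
    "HHW213": {"retained": {"5p", "5cen-q11.2"}, "signal": False},
    # Interstitial deletion of 5q11.2-q13.3: no signal.
    "HHW1064": {"retained": set(SEGMENTS) - {"5q11.2-q13.3"}, "signal": False},
}

def candidate_segments(hybrids, segments):
    """Return segments consistent with every hybrid's hybridization result."""
    candidates = set(segments)
    for hybrid in hybrids.values():
        if hybrid["signal"]:
            # The gene must lie in a segment this hybrid retains.
            candidates &= hybrid["retained"]
        else:
            # The gene cannot lie in any segment this hybrid retains.
            candidates -= hybrid["retained"]
    return sorted(candidates)

print(candidate_segments(HYBRIDS, SEGMENTS))
```

Under these assumptions the only segment compatible with all three hybrids is the interval deleted in HHW1064, which is the inference drawn in the caption.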
Figure 2: Southern blot analysis. 10 µg of human genomic (or rhesus monkey) DNA were digested with individual restriction enzymes and evaluated by Southern blot analysis using the ³²P-radiolabeled 5`-open reading frame cDNA as probe (A) or the 3`-untranslated region cDNA as probe (B). In both situations, no more than two cross-hybridizing fragments are evident with any of the restriction enzymes evaluated, confirming that the thrombin receptor gene is of limited size and complexity. Restricted monkey DNA (A, last lane) generates a fragment identical in size and intensity to its human homologue, suggesting that this region of the gene is highly conserved in nonhuman primates. The relative positions of HindIII-digested λ phage DNA fragments used as size markers are indicated.
To determine if structurally related genes are present in humans, individual filters were stripped, and Southern blot analysis was repeated under low stringency conditions. No novel cross-hybridizing fragments were demonstrable, inconsistent with the presence in the human genome of a structurally related pseudogene. Thus, although a second proteolytically cleaved receptor has been recently identified (18), and the presence of other thrombin receptors has been postulated, these data indicate that they are not highly homologous to the TR. Indeed, the murine putative proteinase-activated receptor (PAR-2) displays only 28% identity to the murine and 30% identity to the human thrombin receptor at the protein level, although certain regions within the transmembrane and extracellular loops appear more highly conserved (18).
To more precisely characterize the TR genomic organization, we initially employed a comparative PCR strategy using total genomic DNA or reverse-transcribed endothelial cell RNA as templates. Distinct oligonucleotide primer pairs spanning the full-length cDNA amplified identically sized fragments from both templates from base pair 490 to the 3`-end of the cDNA (data not shown). We were unable, however, to amplify the remainder of the 5`-sequence using total genomic DNA, suggesting the presence of a large intron upstream of this region.
The initial characterization of the gene was then
confirmed by isolating genomic clones encompassing the TR.
Approximately 1 10
plaques were screened from a human genomic bacteriophage λ library using the TR cDNA as probe, with the isolation of a single, approximately 18-kb genomic clone (λ11A-1).
Southern analysis confirmed that this fragment contained the majority
of the coding sequence, although it lacked the 5`-untranslated region
and first exon. Despite repeated screening using various 5`-fragments,
we were unable to isolate genomic fragments containing this portion of
the TR gene. Accordingly, we then used oligonucleotide primers to
screen P1 clones with the isolation of two clones, P4249 and P4250.
Southern blot analysis confirmed that only P4250 contained the entire
TR gene and that all genomic clones could be resolved by a common
restriction map. Simultaneous genomic analysis using both human genomic
DNA and DNA from individual clones established that the thrombin receptor gene spans approximately 27 kb and contains two exons separated by a large, approximately 22-kb intron (see Fig. 3). The larger second exon
contains the majority of the coding sequence and the thrombin cleavage
site, remarkably similar to the organization of the PAR-2 gene in which
the putative cleavage site is also contained within the large second
exon. Furthermore, the first exons of both genes encode precisely 29 amino acids (25), again highly indicative of a conserved evolutionary pattern from a common primordial gene.
Figure 3: Schematic diagram displaying the structural organization of the thrombin receptor gene. Exons are indicated by solid rectangles. Relevant restriction endonucleases utilized for genomic mapping are indicated: H, HindIII; S, SalI; E, EcoRV; P, PvuII. Lambda phage and P1 clones with approximate ends are depicted.
The thrombin receptor exons, intron/exon boundaries, and portions of flanking introns were bidirectionally sequenced for further analysis. As demonstrated in Table 1, intron/exon boundaries conformed to the known GT/AG splice donor/acceptor rules as described previously by Mount (26). The single intronic splice junction within the TR coding sequence occurred after the first nucleotide of the codon 29 triplet, indicative of a Type I splice site. The 3`-border of exon 2 diverged from the initial published sequence precisely at the poly(A) tail (8) and contained an adenosine-enriched region typical of a polyadenylation tract. The entire coding sequence proved to be identical to the published cDNA, except for the presence of a CG inversion at nucleotides 935-936 (CG→GC; Leu unchanged, Val to Leu), as described previously in the endothelial cell TR cDNA homologue (9).
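The intron placement described above can be checked with simple arithmetic. The sketch below is illustrative and not part of the original analysis: the function names and the toy intron sequence are hypothetical, and it assumes that the "Type I" designation corresponds to what is commonly called intron phase 1 (an intron interrupting a codon after its first nucleotide).

```python
def intron_phase(coding_nt_before_intron: int) -> int:
    """Intron phase = number of coding nucleotides upstream of the intron, modulo 3."""
    return coding_nt_before_intron % 3

def follows_gt_ag(intron: str) -> bool:
    """True if an intron sequence begins with GT and ends with AG."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")

# The intron falls after the first nucleotide of codon 29, so exon 1 carries
# 28 complete codons plus 1 additional coding nucleotide.
coding_nt_in_exon1 = 28 * 3 + 1
print(coding_nt_in_exon1, intron_phase(coding_nt_in_exon1))

# Toy intron sequence (hypothetical, not the TR intron) obeying the GT/AG rule.
print(follows_gt_ag("GTAAGTTTTCAG"))
```

With 85 coding nucleotides in exon 1, the remainder on division by 3 is 1, consistent with a splice junction that splits codon 29 after its first position.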
Figure 4: Determination of the TR transcription start site by primer extension analysis. The ³²P-radiolabeled 30-mer oligonucleotide 1715 was annealed to 20 µg of total cellular RNA from human umbilical vein endothelial cells (lane 1), HEL cells (lane 2), or 5 µg of HEL cell poly(A) RNA (lane 3) for primer extension analysis as outlined under "Materials and Methods." The product was analyzed by acrylamide gel electrophoresis in parallel with a sequencing reaction using the identical oligonucleotide primer and the 3-kb genomic fragment cloned into the HindIII site of M13mp18 known to contain the first exon and 5`-untranslated region (see Fig. 3). A single extension product corresponding to a guanine nucleotide (complementary strand sequence) 351 base pairs upstream from the initiator methionine is seen with all three samples.
Figure 5: Analysis of the thrombin receptor gene 5`-flanking sequence. A, sequence analysis of the 5`-regulatory region is displayed with relevant restriction sites indicated. The Alu-like repeats are underlined, the transcription initiation site is delineated by the arrow, the start of thrombin receptor cDNA clone 4-1 (9) is depicted by the black diamond, and the start of the original published cDNA sequence (8) is represented by the star. The location of primer ON 1715 used for primer extension analysis in Fig. 4 is depicted by the thick black line above the sequence. B, schema summarizing the putative transcriptional regulatory sequences identified within the 5`-regulatory region. This sequence has been deposited into the GenBank data base and assigned the accession number U35634.
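A search for putative regulatory motifs of the kind summarized in Figure 5B can be sketched as a consensus scan over both strands. The example below is a minimal illustration and not the procedure used in this study (sequence analysis here was performed with the Wisconsin Genetics Computer Group Package). The consensus strings are commonly cited forms chosen as assumptions for the example, and the test sequence is a made-up toy string, not the deposited TR 5`-flanking sequence.

```python
import re

# IUPAC nucleotide codes needed for the consensus strings below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "N": "[ACGT]"}

# Commonly cited consensus forms (assumed for illustration only).
MOTIFS = {
    "GATA": "WGATAR",
    "Sp1 (GC box)": "GGGCGG",
    "octamer": "ATGCAAAT",
    "AP-2-like": "GCCNNNGGC",
    "TATA box": "TATAWA",
}

def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan(seq: str, consensus: str):
    """Return sorted (position, strand) hits; positions are 0-based on the + strand."""
    # A lookahead group lets overlapping matches be reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")
    seq = seq.upper()
    n, k = len(seq), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Hits on the reverse complement are mapped back to + strand coordinates.
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(revcomp(seq))]
    return sorted(hits)

toy = "CCAGATAAGGGGCGGTT"  # hypothetical sequence for demonstration
for name, consensus in MOTIFS.items():
    print(name, scan(toy, consensus))
```

Short consensus matches of this kind occur frequently by chance, so hits are best read as candidate sites, in keeping with the "potentially involved" wording used for the motifs in this paper. Palindromic consensus strings would be reported once per strand by this sketch.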
The seven-transmembrane segment thrombin receptor represents the prototype of a novel class of proteolytically cleaved receptors that mediate signaling events by functional coupling to G-proteins. The identification of PAR-2 reinforced the concept that circulating proteases (in addition to α-thrombin) may affect cellular events through such proteolytically cleaved receptors. Although a physiological enzyme agonist for PAR-2 has not been identified, preliminary observations from other laboratories suggest that both receptors display similar activation mechanisms. The data presented in this manuscript demonstrate that these functional similarities also extend to the structural organization of the genes. Both genes contain two exons separated by a large intron, both genes encode identical numbers of amino acids within the first exon, and the cleavage sites for both gene products are similarly contained within the larger second exon. Thus, although the proposed gene family is currently limited to two members, we speculate that other similarly organized genes are present in humans, presumably having evolved from a common ancestral gene.
Addendum: Since the submission of this manuscript, the human homologue of the PAR-2 gene has been isolated and characterized (36). Like the TR and murine PAR-2 genes, the human PAR-2 gene has essentially identical genomic organization. Interestingly, human PAR-2 co-localizes with the human TR gene at 5q13.