(Received for publication, October 6, 1994)
We report the isolation and organization of the gene encoding human tryptophan hydroxylase (TPH) and an analysis of the corresponding mRNAs. The gene spans a region of 29 kilobases, which contains at least 11 exons and a variably spliced 5'-untranslated region (5'-UTR). The sequence of the coding region and most of the intron-exon boundary positions of the human TPH gene are very similar to those of the genes encoding human tyrosine hydroxylase and phenylalanine hydroxylase, the other members of the aromatic amino acid hydroxylase family. Phylogenetic analysis indicates the early divergence and independent evolution of the three hydroxylase types. TPH cDNA cloning and anchored polymerase chain reaction revealed a diversity of the TPH mRNA that is restricted to the 5'-UTR. Four TPH mRNA species were detected by Northern blot with pineal gland and carcinoid tumor RNAs. These messengers are transcribed from a single transcriptional initiation site, and their diversity results from differential splicing of three intron-like regions and of three exons located in the 5'-UTR. Analysis by S1 nuclease protection revealed that the intron-like regions in the 5'-UTR are mostly unspliced and that TPH mRNA species in which the three intron-like regions are eliminated are present at a low level in pineal gland and are not detectable in carcinoid tumors.
Tryptophan hydroxylase (TPH) is the key enzyme in the biosynthetic pathway of the neurotransmitter serotonin. In mammals, TPH is expressed in only a small number of tissues, namely the brainstem raphe nuclei and the pineal gland in the central nervous system, and the pancreatic and intestinal enterochromaffin cells in the periphery. The serotonergic neurons of the raphe nuclei form a highly divergent neuronal system controlling the basic activity of many target regions distributed throughout the forebrain, the cerebellum, and the spinal cord. This neuronal system modulates a variety of psychological and physiological processes including thirst and appetite, sleep and memory, and reproduction (1). In the pineal gland, the concentration of serotonin, which is an intermediate in melatonin synthesis, is higher than in any other region of the brain or in any other organ analyzed. The production of melatonin is characterized by a dark-light circadian rhythm, which synchronizes the circadian and ultradian cycles involved in a variety of functions including sleep, sexual behavior, and body temperature (2, 3). However, the mechanisms involved in regulating these diverse activities remain uncharacterized. At the periphery, TPH is also present in the autonomic nervous system, in the intestinal and pancreatic enterochromaffin cells, and in carcinoid tumors. These tumors develop from intestinal enterochromaffin cells, produce large amounts of TPH, and secrete large amounts of serotonin, accounting for most of the symptoms associated with this pathology (4, 5).
TPH is a member of the aromatic amino acid hydroxylase family, which also includes tyrosine hydroxylase and phenylalanine hydroxylase. These three enzymes require a reduced pterin cofactor to hydroxylate their amino acid substrate and interact with Fe in their tetrameric quaternary structure to coordinate oxygen molecules (6). The human TPH amino acid sequence is similar to those of tyrosine hydroxylase (7) and phenylalanine hydroxylase (8), and the most conserved region comprises the 340 C-terminal amino acids of the proteins. This region shows 49% identity among the mammalian hydroxylases without any gaps and with mostly conservative substitutions (the TPH enzyme contains 5 extra amino acids at the C-terminal end that are not found in the other enzymes). Biochemical (9, 10, 11) and sequence analyses (12) suggest that this domain corresponds to the catalytic part of the enzyme. The N-terminal domain differs more in size and sequence between the enzymes and has been proposed to modulate the enzyme activity. Relatively little is known about the regulation and the biochemistry of TPH because it is not abundant and is extremely unstable in vitro. Therefore, cloning the gene could lead to a much better understanding of the biochemistry and regulation of TPH.
Recently, cDNAs encoding rabbit and rat TPH have been cloned from pineal gland, and this was followed by the isolation of homologous human and mouse cDNAs from carcinoid tumor and mastocytoma cDNA libraries, respectively (13, 14, 15, 16). The coding region of human TPH cDNA extends over 1332 bp, and the deduced amino acid sequence is very similar along its entire length to those of rat (91.2% (18)), rabbit (94.6% (13)), and mouse (90.1% (16)). The rat TPH gene is characterized by a diversity restricted to the 5'- and 3'-UTR of the TPH mRNAs, which could provide the basis for post-transcriptional regulation of TPH gene expression (18, 19, 20, 21). We report the isolation of the human TPH gene and show, by a combination of cDNA cloning, polymerase chain reaction (PCR) analysis, and S1 nuclease protection, an unexpected diversity in the 5'-UTR of TPH mRNAs. These messengers are transcribed from a single promoter, and their diversity results from the conservation of one or several intron-like sequences and from the differential splicing of three exons in the 5'-UTR of the TPH mRNAs.
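The percentage identities quoted above are pairwise comparisons of aligned sequences. As a minimal sketch of how such a figure can be computed, assuming two sequences that have already been aligned with `-` marking gaps, the Python function below counts matching positions over the ungapped columns. The function name `percent_identity` and the toy peptides are illustrative assumptions; they are not TPH sequences, and the exact scoring convention used for the published values is not stated in the text.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, invented peptides (hypothetical, not from the paper): 9 of 10 positions match.
toy_a = "MIEDNKENKD"
toy_b = "MIEDNKENKE"
print(round(percent_identity(toy_a, toy_b), 1))  # 90.0
```

Other conventions (for example, dividing by the full alignment length including gaps) give slightly different percentages, so values from different sources are comparable only when the same convention is used.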
The coding and noncoding sequences of the human TPH mRNA were aligned with those of the other aromatic amino acid hydroxylases from different animal species. Computer-aided sequence comparisons and analysis were performed with the GCG program (23). After sequences were optimally aligned, phylogenetic distance trees were constructed by the Neighbor Joining Method of Saitou and Nei (24), and bootstrap analysis was performed by using the MUST package (25).
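The tree-building step described above can be illustrated with a short sketch of the neighbor-joining procedure applied to a pairwise distance matrix. This is a generic Python illustration, not the MUST package used in the study; the function name `neighbor_joining`, the internal node labels, and the five-taxon toy distances are assumptions chosen for illustration, and the bootstrap step is omitted.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Neighbor joining on a distance matrix.

    names: list of taxon labels.
    dist:  dict mapping (a, b) tuples to pairwise distances (one entry per pair).
    Returns a list of (node, neighbor, branch_length) edges of the unrooted tree.
    """
    d = {frozenset(pair): value for pair, value in dist.items()}
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair minimizing Q(i, j) = (n - 2) * d(i, j) - total(i) - total(j).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]],
        )
        dij = d[frozenset((i, j))]
        len_i = 0.5 * dij + (total[i] - total[j]) / (2 * (n - 2))
        len_j = dij - len_i
        new = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        edges.append((i, new, len_i))
        edges.append((j, new, len_j))
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Toy additive distances for five invented taxa (not hydroxylase data).
toy = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
tree = neighbor_joining(["a", "b", "c", "d", "e"], toy)
print(tree[0])                      # ('a', 'node0', 2.0)
print(sum(e[2] for e in tree))      # 17.0
```

In a bootstrap analysis, the alignment columns would be resampled with replacement, the distances recomputed, and the tree rebuilt many times to count how often each grouping recurs; that part is left out here to keep the sketch short.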
The human TPH cDNA clones fell into two categories, which differed by the organization of their 5'-leader sequences and their abundance. The majority of TPH cDNA clones, referred to hereafter as type 1 cDNAs, all had long 5'-UTRs of variable size and probably corresponded to incompletely reverse-transcribed mRNAs (the longest type 1 cDNA was 6 kb long with a 5'-UTR of 2.5 kb). In contrast, the type 2 cDNA clones were about 3.6 kb long with a short 5'-UTR (the longest type 2 cDNA has a 5'-UTR of 310 bp). They differed from the type 1 cDNA clones by the absence of a 1.7-kb sequence (named I) 26 nucleotides upstream of the translation initiation codon (Fig. 1A). The 5'- and 3'-extremities of the I region showed intron splice site sequences, suggesting that I may be recognized as an intron in some TPH transcripts. Therefore, the two types of cDNA clones may result from alternative splicing in the 5'-UTR; alternatively, the type 1 cDNA clones may represent TPH precursor RNA, whereas the type 2 clones represent the mature transcript.
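The observation that the extremities of the 1.7-kb region resemble intron splice sites can be illustrated with a very simple check. The sketch below tests only the canonical GT...AG dinucleotides at the two ends of a candidate region; real splice-site recognition relies on longer consensus sequences, and the criteria actually applied in the study are not given in the text. The function name and the toy sequences are illustrative assumptions.

```python
def has_canonical_splice_ends(region: str) -> bool:
    """True if the region starts with GT (donor) and ends with AG (acceptor)."""
    seq = region.upper().replace("U", "T")  # accept RNA or DNA notation
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Toy, invented sequences (not taken from the TPH gene).
print(has_canonical_splice_ends("GTAAGTCCTTTTCTCCAG"))  # True
print(has_canonical_splice_ends("ATAAGTCCTTTTCTCCAC"))  # False
```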
Figure 1: Schematic representation of the various 5'-cDNA clones for human TPH and organization of TPH 5'-UTR on genomic DNA. A, two types (1 and 2) of TPH cDNA clones were obtained by cDNA cloning. The two longest species of these different cDNAs were 6 kb and 3.5 kb, respectively. Open and shaded boxes indicate the 5'-noncoding and coding exons, respectively. Thick horizontal lines represent the 3'-UTR and the intron-like sequences in the 5'-leader sequence. The broken lines indicate the elimination of the intron-like region. B, two 5'-noncoding extremities of TPH cDNA clones were isolated by anchored PCR: SLIC type 1 and SLIC type 3. C, restriction map and organization of the TPH 5'-noncoding region in genomic DNA. The arrow shows the position of the transcription initiation site. Some of the restriction sites present in the TPH gene are shown.
To determine the size and the number of human TPH mRNAs, Northern blots were performed under denaturing conditions with RNA extracted from various tissues that do or do not produce TPH. No hybridization signal was detected with RNA purified from liver and dorsal raphe nuclei. In contrast, the cDNA probe labeled four major transcripts with apparent sizes of 5, 6, 7.5, and 9 kb in mRNA from pineal gland and carcinoid tumors (Fig. 2A). These results agreed with the known distribution of the enzyme in tissues, with the notable exception of the raphe nuclei area of the brainstem, which, as found in the rat, did not show any hybridization signal (20). The 5-kb species appeared to be the most abundant and was the only signal detected in RNA prepared from the intestine. The high molecular weight TPH mRNA forms were more abundant in carcinoid tumor than in pineal gland RNA.
Figure 2: Northern blot analysis of TPH mRNA expression. A, tissue distribution of TPH transcripts. Two µg of poly(A) RNA from carcinoid tumor (Carcinoid T.) and pineal gland (Pineal and Pineal (5d)), and 20 µg of total RNA from raphe nuclei, colon, and liver were subjected to gel electrophoresis, blotted onto a nylon filter, and hybridized to a 32P-labeled TPH probe. Pineal and Pineal (5d) are identical samples with different exposure times (Pineal, 18-h exposure; Pineal (5d), 5-day exposure). The arrows show the TPH mRNAs and indicate their molecular weights. B, Northern blot analysis of TPH transcripts with intron and exon probes. Carcinoid tumor (right panel) and pineal gland RNAs (left panel) were hybridized with a 1-kb-long TPH coding sequence (lane 1), I intron-like region (lane 2), I intron-like region (lane 3), or I intron-like region probes (lane 4).
To better characterize the diversity of TPH mRNA, the whole coding sequence and the 3'-UTR were analyzed by PCR amplification from pineal gland and carcinoid tumor cDNAs. A single fragment was detected for each of the four overlapping subregions (defined by the primers) spanning these domains, in agreement with the length of the cloned TPH cDNA sequences (data not shown). In this respect, human TPH mRNA differs from rat and mouse TPH mRNAs, which possess two different 3'-untranslated regions generated by alternative polyadenylation sites. Thus, the diversity of TPH mRNAs may arise from RNA splicing in the 5'-noncoding region. The size of the cDNA clones cannot easily be reconciled with that of the TPH mRNAs detected on Northern blots, possibly because the cDNA clones are incomplete at the 5'-end. We therefore cloned the entire human TPH gene to facilitate the analysis of the diversity of its mRNAs.
Figure 3: Structural organization and exon-intron junctions of the human tryptophan hydroxylase gene. A, restriction map of the TPH gene. Filled and open boxes indicate the noncoding and coding exons, respectively. Open and filled circles represent HindIII and EcoRI sites, respectively. Three different phage clones (12, 13, 15) cover the whole TPH gene. B, exon-intron structure of the human TPH gene. The exon sequences are denoted in uppercase letters; intron sequences are in lowercase and are given for each junction. The amino acids and their corresponding positions on the cDNA sequence are also shown underneath each exon-intron boundary.
Figure 4: Conservation of intron-exon junction positions between the human TPH, tyrosine hydroxylase (TH), and phenylalanine hydroxylase (PAH) genes. A, the three genes are aligned to maximize amino acid identity. Boxes and thick lines indicate the exons and the introns, respectively. Numbers in the boxes indicate the percentages of identity in the exons between the hydroxylase genes. The colors white, gray, and black provide a visual representation of the degree of identity: 0-35%, 36-59%, and 60-100%, respectively. B, phylogenetic distance tree based on the protein sequences of the aromatic amino acid hydroxylases. After sequence alignment, the distance tree was constructed by the Neighbor Joining Method (left; (24)) and bootstrap analysis (right) as described under "Materials and Methods." The tree was arbitrarily rooted on the tyrosine hydroxylase Drosophila sequence. Note the heterogeneity of the molecular clock between the three subfamilies of hydroxylases, which does not allow the sequence duplications to be dated with confidence.
A phylogenetic distance analysis was performed after aligning all of the available sequences of the three aromatic amino acid hydroxylases (AAAH) isolated from the different animal species. The topologies of the trees obtained from the nucleotide and the amino acid sequences were identical. Moreover, tree branching was also unchanged whether the whole sequences were used or only the C-terminal two-thirds of the molecules, where the sequences aligned without any gap or deletion. Thus, all the parts of the sequences have presumably evolved at roughly the same relative rates. Each of the three vertebrate AAAH, i.e. tyrosine hydroxylase, phenylalanine hydroxylase, and tryptophan hydroxylase, clearly constitutes a monophyletic group, a contention unambiguously supported by the bootstrap analysis (Fig. 4B). The phylogeny of animal species is correctly reproduced by the AAAH sequences, but the rate of sequence divergence varies significantly among the hydroxylases (note branch lengths for human, rat, and mouse in each of the three groups; Fig. 4B). Thus, there is no satisfactory molecular clock with which to date the divergence of the three AAAH with confidence. Similarly, it is not possible to determine which of the three AAAH diverged first, although the absence of a bona fide TPH in Drosophila (32) suggests an early divergence of tyrosine hydroxylase before that of TPH and phenylalanine hydroxylase from a common ancestral gene.
One of the genomic clones, 12, which contained both the 5'-noncoding region of TPH cDNA and upstream sequences, was analyzed by detailed restriction mapping (Fig. 1C). The sequence of the region upstream of the translation initiation site on the genomic DNA was identical to the entire 5'-UTR of the type 1 cDNA clones. It was therefore clear that the domain I characterizing the type 1 cDNAs corresponded to an unspliced, intron-like sequence.
The analysis of the 5'-ends of TPH mRNA obtained by the anchored PCR technique confirmed the predominance of the TPH mRNA containing the intron-like sequence of 1.7 kb (I) and also showed the existence of a new 5'-noncoding extremity. In short, cloning TPH cDNAs revealed a diversity in the 5'-UTR of the human TPH mRNAs resulting from the conservation or the elimination of two intron-like regions, I and I.
Figure 5: Determination of the transcription initiation site by S1 nuclease protection. Poly(A) RNA (0.25 µg) from pineal (Pi), carcinoid tumor (CT), and liver (L) was subjected to S1 nuclease protection analysis. Pr-S1, probe without S1 nuclease; Pr+S1, probe with S1 nuclease. The arrows show the protected fragments and their corresponding sizes. A, probe A was a 0.42-kb SmaI-PstI fragment. B, probe B was a 0.347-kb HindIII-SmaI fragment. C, probe C was a 0.287-kb EcoRI-HindIII fragment. D, probe D was a 0.574-kb KpnI-EcoRI fragment. The bottom panel represents the position of probes A, B, C, and D in genomic DNA (horizontal arrows). The vertical arrows indicate the position of restriction enzyme sites, and the broken arrow indicates the transcription initiation site. Open boxes indicate the 5'-noncoding exons.
The S1-mapping experiment using probe A, which spans the cap site, also revealed an additional protected fragment of 29 bases in pineal gland RNA (Fig. 5A). The presence on the genomic sequence of a 5'-splice donor site 29 nucleotides downstream of the transcription initiation site suggested that this fragment could correspond to a small exon located at the cap site. This was confirmed by PCR experiments with a primer bordering the cap site and another primer localized inside the 5'-end of the type 3 SLIC clone. Sequence analysis of the PCR fragment confirmed the existence of the 29-base exon (called exon E) and showed the splicing of the 1.4-kb region (data not shown). In this fragment, exon E was joined to a 179-base sequence (named exon E) within the 210-bp stretch of the type 3 SLIC clone (Fig. 1C). Therefore, the type 3 SLIC clone contained a part of the region I, which, like the I and I regions, could be recognized as an intron but is present in most TPH mRNA extracted from the pineal gland and carcinoid tumors (see above).
Figure 6: PCR analysis of the 5'-UTR of human TPH mRNA. PCR was performed with specific primers after first-strand cDNA synthesis using poly(A) mRNA isolated from carcinoid tumor (CT) and pineal gland (Pi). Open and hatched boxes represent the noncoding and coding exons, respectively. The organization of the 5'-noncoding extremity for each PCR fragment is schematized. A, radiolabeled PCR with specific TPH primers (O and O). B, radiolabeled PCR with specific TPH primers (O and O). The products of the two sets of primers were separated on a 5% denaturing polyacrylamide gel and visualized by autoradiography. C, PCR with primers overlapping exons E,E and E,E. The products of PCR were separated on a 3% agarose gel, transferred to a membrane, and hybridized with a labeled oligonucleotide (SBCRIB).
Figure 7: S1 nuclease analysis of the 5'-UTR exons (C), 5'-intron-like regions (A, B), and coding region (D) of TPH mRNA. Poly(A) RNA (0.25 µg) from pineal (Pi), carcinoid tumor (CT), and liver (L) was subjected to S1 nuclease protection. Pr-S1, probe without S1 nuclease; Pr+S1, probe with S1 nuclease. A and B, S1 nuclease mapping with the E and F probes containing 290 and 362 nucleotides from the intron-like regions. C, probe G was 250 nucleotides long and contained a part of the exon E (117 nucleotides). D, probe H was 330 nucleotides long and covered three coding exons. The arrows show the protected fragments for each probe used. The bottom panel shows the positions of probes (E, F, G, H) in genomic DNA (horizontal arrows). Open and shaded boxes indicate the 5'-noncoding and coding exons, respectively.
The low abundance of the spliced TPH mRNA 5'-UTR was also confirmed with two other probes (I and J). Probe I corresponds to 187 bases of the 5'-noncoding extremity of TPH mRNA and contains the first noncoding exon linked to the exon E (Fig. 8A). The entire probe was fully protected by the pineal gland RNA, showing that short mRNA 5'-extremities are present in normal tissue. Only a very weak protection was obtained in the carcinoid tumor, where splicing of intron-like regions in the TPH 5'-UTR is rare. In contrast, probe J (349 bases), complementary to the sequence of the type 3 SLIC clone, was poorly protected in the two tissues (Fig. 8B). Therefore, the type 3 SLIC clone isolated by anchored PCR is a minor form of TPH mRNA.
Figure 8: S1 nuclease analysis of the TPH 5'-UTR region. Poly(A) RNA (0.25 µg) from pineal gland (Pi), carcinoid tumor (CT), and liver (L) was subjected to S1 nuclease protection. Pr-S1, probe without S1 nuclease; Pr+S1, probe with S1 nuclease. A, S1 nuclease mapping with probe I of PCR fragment (187 bases) corresponding to one of the TPH 5'-noncoding extremities. B, probe J is 349 bases long and corresponds to the type 3 SLIC clone. The arrows show the fragments protected by each probe. The schematic organization of each protected fragment is represented. Open boxes indicate the 5'-noncoding exons, and horizontal lines indicate the intron-like region.
Finally, to determine which of the three intron-like regions was retained in the 5'-UTR of the major species of TPH transcripts, probes specific to these three regions were generated by PCR amplification and hybridized to different Northern blots (Fig. 2B). One intron, I, contained three antisense Alu sequences separated by up to a few hundred bases of non-Alu DNA. Thus, we chose an intronic probe in a region located outside of the repetitive elements. However, no simple conclusion can be drawn from these hybridizations. In both carcinoid tumor and pineal gland, the major 5-kb band was recognized by each of the three probes, suggesting that it may correspond to several species of TPH mRNA with different 5'-UTRs but similar size. In practice, it was very difficult to quantify the relative abundance of the various other transcripts labeled by the intronic probes from the RNA material available. Therefore, although each of the mRNA bands detected on Northern blot may correspond to a different and complex exon-intron arrangement in the 5'-UTR, it was impossible to unravel the organization of these TPH transcripts by simple hybridizations to the blots. Nevertheless, Northern analysis supported the main conclusion of the more extensive and more accurate nuclease protection experiments, namely that the TPH transcripts from which the three intron-like regions in the 5'-UTR have been eliminated generally represent a minor population compared with the partially spliced TPH mRNAs, a situation that is much more pronounced in carcinoid tumor than in pineal gland.
The study of the human TPH mRNA by cDNA cloning, anchored PCR amplification, and S1 nuclease protection assays revealed a very unusual organization of its 5'-UTR. Human TPH mRNA exhibited a large diversity in the 5'-leader sequence, whereas the coding region was identical in all of the tissues studied. Four TPH transcripts were visualized by Northern blotting of both pineal gland and carcinoid tumor RNA. To unravel this complex organization, the corresponding gene was isolated and mapped. The sequence and the locations of the intron-exon junctions of the human TPH gene revealed very strong similarity to those of the genes encoding other aromatic amino acid hydroxylases (AAAH).
The human TPH locus spans 29 kb and contains at least 11 exons, and its mRNA appears to undergo differential splicing in the 5'-UTR. The locations of intron-exon junctions of the mammalian AAAH genes are very similar, particularly in the region corresponding to the catalytic core of the enzyme (see Fig. 4). The only exceptions are one intron specific to tyrosine hydroxylase (I-6), one specific to phenylalanine hydroxylase (I-12), and one common to tyrosine hydroxylase and phenylalanine hydroxylase, which is absent from TPH and could therefore have been lost during the course of evolution. The 5'-extremity of the fourth exon is 4 amino acids longer in the TPH gene than in the corresponding exon of the other hydroxylases. The N-terminal region is less well conserved and is encoded by a number of exons that varies from one enzyme to another, and even among mammalian species, as in the case of tyrosine hydroxylase (7). Interestingly, the junction between the regulatory and catalytic domains of the proteins corresponds to an intron-exon junction in all the genes of the family. Among the hydroxylase genes, the mouse and human TPH genes alone have introns in the 5'-UTR (one and three introns, respectively). It is also very likely that an intron is present in the 5'-UTR of the rabbit TPH mRNA. Indeed, the 5'-leader sequence of the rabbit TPH cDNA shares 74.4% identity and 85.7% identity, respectively, with the human TPH exons E and E, which border the 5'-UTR and are separated by introns (Fig. 9). However, this suggestion awaits experimental confirmation. In addition, the good conservation of the TPH 5'-UTR sequence between human and rabbit is not found in the other known mammalian species, indicating major sequence shuffling in this region.
Figure 9: Comparison of human and rabbit TPH gene 5'-UTR sequences. A, nucleotide sequences of human and rabbit TPH are numbered with respect to the transcription start site and to the first nucleotide of the cDNA, respectively. =, nucleotides conserved between the two species. B, schematic representation and alignment of the rabbit TPH 5'-UTR with the exons (E and E) and part of the I intron of human TPH 5'-UTR. Hatched boxes represent the human or rabbit exons of the TPH gene 5'-UTR. Thick lines indicate the introns of the TPH gene 5'-UTR.
Phylogenetic distance analysis, using either the amino acid or the nucleotide sequences of this gene family, only partly supports the conclusions drawn by Woo and colleagues (12, 35) about the evolution of the AAAH genes. These authors proposed that two major gene duplications have occurred; the first one separated tyrosine hydroxylase from the common ancestor, and the second one gave rise to phenylalanine hydroxylase and TPH. However, the uncertainty about the regularity of the molecular clock in this protein family (Fig. 4B) and the small number of sequences available from animal species belonging to different phyla do not allow the duplications to be dated with confidence. Nevertheless, it was recently proposed that Drosophila melanogaster possesses only two aromatic amino acid hydroxylase genes, one being tyrosine hydroxylase-homologous and the other having both phenylalanine hydroxylase and TPH activities (32). If one could rule out the possibility that one of the hydroxylases was eliminated as redundant in the Drosophila lineage, one could propose that the first duplication occurred before, and the second one after, the divergence of arthropods from the other taxa (600 million years ago).
Human TPH mRNAs are characterized by large diversity within, and restricted to, the 5'-noncoding region. This diversity results from the conservation of one or more intron-like regions in the 5'-leader sequence of the TPH mRNA and from differential splicing of three exons in the 5'-UTR of the spliced TPH mRNAs. Generally, mRNAs that display 5'-UTR diversity, for example those of the genes encoding mouse choline acetyltransferase (36), human insulin-like growth factor II (37), and aldolase A (38), are transcribed from alternative promoters, followed by the splicing of intervening sequences in the 5'-UTR. An extreme example is the hydroxymethylglutaryl-CoA reductase gene, where a more complex mechanism involves the combination of multiple transcription initiation sites and various 5'-splice donor sites for one intron (39). This diversity within the mRNA 5'-leader sequences is therefore associated with the use of alternative promoters, which could be preferentially activated in particular tissues or stages of development (37, 40, 41). In the case of the human TPH gene, however, the multiple mRNA species are transcribed from a single promoter, and the variety of TPH messengers results only from the differential splicing of three intron-like regions and of the three exons located in the 5'-UTR.
It is surprising that the three intron-like sequences in the 5'-UTR of TPH mRNAs are in many cases retained when the introns of the coding region are eliminated. Northern blotting clearly identified high molecular weight TPH transcripts, which may result from differential splicing, generating unusually long 5'-noncoding sequences. These transcripts are more abundant than would be expected of processing intermediates. This led to their isolation directly from the screening of the cDNA library, a rather uncommon event. Only PCR experiments allowed the cloning and the characterization of several rare, differentially spliced TPH mRNA species in which the three intron-like regions are eliminated. Generally, mRNAs with long 5'-leader sequences correspond to precursors. In that case, the abundance of long 5'-UTRs in TPH mRNAs would imply that the excision of the three intron-like regions is sufficiently slow and that these messengers are sufficiently stable to allow their accumulation. The limiting step of TPH mRNA processing could be the splicing of the 5'-leader sequence rather than nuclear RNA degradation. Therefore, there appear to be two steps in the processing of human TPH mRNA. The first is rapid, eliminating the introns of the coding region. The second is slower, leading to a complex pattern of 5'-UTR maturation.
The mRNAs not containing region I (such as the type 2 TPH cDNA) are characterized by the presence of a supplementary in-phase AUG codon, 27 bases upstream of the presumed translation start site. The putative use of this initiator codon would generate a longer N-terminal sequence. Nevertheless, it remains to be determined whether or not this protein is produced and whether the two resulting proteins possess the same characteristics (i.e. stability or activity). The recent cloning of Xenopus laevis TPH cDNA has shown that it potentially encodes a TPH protein with 37 extra amino acids at the N terminus as compared with TPH in other species (42). To date, this extension has no known functional consequences.
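The size of the possible N-terminal extension follows directly from the position of the upstream AUG. As a small arithmetic sketch, assuming that no in-frame stop codon lies between the two AUGs (an assumption implied by, but not spelled out in, the text), the Python function below checks whether an upstream AUG is in frame and, if so, how many extra residues it would add. The function name `extra_residues` is hypothetical.

```python
from typing import Optional


def extra_residues(offset_nt: int) -> Optional[int]:
    """Extra N-terminal residues added by an AUG located offset_nt bases upstream.

    Returns None when the upstream AUG is out of frame with the main start codon.
    Assumes no in-frame stop codon between the two AUGs.
    """
    if offset_nt <= 0:
        raise ValueError("offset must be a positive number of bases")
    if offset_nt % 3 != 0:
        return None
    return offset_nt // 3


print(extra_residues(27))  # 9
print(extra_residues(26))  # None
```

An AUG 27 bases upstream would therefore add nine residues, including the new initiator methionine, to the N terminus.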
It is generally thought that the 5'-UTR contributes to the stability (43) and to the regulation of translation of the messengers (44, 45). However, the presence of long 5'-noncoding sequences in TPH mRNAs poses many problems with regard to translation mechanisms. The initiation of translation in higher eukaryotes is modulated by several structural features in the 5'-untranslated region of mRNA. They include the m7G cap, the position of the AUG codon, the length of the leader sequence, and secondary structures (46). mRNAs with long 5'-UTRs could correspond to precursors or to otherwise nonfunctional transcripts. Indeed, translation initiation optimally requires a short 5'-noncoding region and no AUG codon upstream of that used to initiate translation (47). Introns retained in the 5'-leader sequence considerably impair translation efficiency because they often burden the leader with AUG codons. Several AUG codons upstream of the translation initiator AUG are present in the type 1 and type 2 human TPH cDNA clones. All of these upstream AUG codons are followed by short open reading frames that could potentially encode peptides. According to the scanning model of translation, these AUG-burdened RNA sequences corresponding to the high molecular weight TPH transcripts are expected to be poorly translated, a characteristic that could be compensated for by the abundance of these mRNAs. In this case, these mRNAs would be translated without additional maturation. In addition, there have been several reports of abundant, incompletely spliced transcripts that enter the cytoplasm (48) and have been found on polysomes (49). This observation suggests that the introns, when they are maintained in the transcripts, could play a role in the regulation of gene expression.
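The notion of an AUG-burdened leader can be made concrete with a short scan for upstream AUG codons and the short open reading frames that follow them. The Python sketch below is a generic illustration under simple assumptions: the leader is supplied as a plain string, only the three standard stop codons are considered, and an upstream open reading frame is reported only if its stop codon lies within the leader. The function name `upstream_orfs` and the toy leader are invented for illustration and are not TPH sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_orfs(leader: str):
    """List (start, end, codons) for each ATG in the leader with an in-frame stop.

    start and end are 0-based, end-exclusive positions in the leader (end includes
    the stop codon); codons is the number of sense codons, counting the ATG.
    """
    seq = leader.upper().replace("U", "T")  # accept RNA or DNA notation
    found = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the ATG until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                found.append((start, pos + 3, (pos - start) // 3))
                break
    return found


# Toy leader (hypothetical): one upstream ATG followed by AAA and a TAG stop.
print(upstream_orfs("CCATGAAATAGGG"))  # [(2, 11, 2)]
```

Under the scanning model, each such upstream AUG is a site at which ribosomes may initiate before reaching the main start codon, which is why a leader containing several of them is expected to be translated inefficiently.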
In contrast, it has recently been shown that precursor RNAs can be synthesized and stored for later processing (50). In this model, the large TPH RNAs would be precursors, to be translated only after a maturation step, a mechanism that easily accounts for the abundance of TPH mRNAs bearing long 5'-UTRs relative to those with short 5'-UTRs. Only the TPH mRNAs with a short 5'-leader sequence would be effectively translated, and the low abundance of these transcripts could result from their rapid degradation. The conversion of a stable, untranslatable precursor to a functional mRNA generates a supplementary step in the regulation of gene expression.
Finally, internal translation initiation, as described for some viral and eukaryotic genes (51, 52, 53), could also explain the abundance of long TPH mRNAs. This translation initiation mechanism allows messengers with long 5'-leader sequences to be efficiently translated. Each of these three models could account for the large difference between the amounts of human TPH mRNAs with long and short 5'-UTRs, and the intracellular localization of the high molecular weight TPH mRNAs may indicate whether or not they can be translated. In any case, the diversity exhibited by the 5'-UTR of the human TPH mRNAs may play a physiological role in the production of TPH enzyme. It increases the possibility of modulation of TPH gene expression at post-transcriptional and translational levels.
Another particularity of TPH mRNA expression is the discrepancy between the tissues in which the TPH enzyme and mRNA are found. Northern blot analysis detected TPH transcripts in the pineal gland, intestine, and carcinoid tumor but not in the brainstem raphe nuclei, which nevertheless contain TPH. There have been similar observations in rat, rabbit, and mouse (13, 16, 18). The discrepancy between TPH mRNA and protein levels in the brainstem could be explained by (i) the existence of another TPH gene expressed specifically in the raphe nuclei, (ii) better translation efficiency of very small amounts of TPH mRNA, or (iii) enhanced stability of the TPH protein. Measurements of the TPH gene transcription rate have shown that the level of gene expression was similar in the pineal gland and in the brainstem, suggesting post-transcriptional or translational regulation of the TPH mRNA (21). It is possible that no high molecular weight TPH mRNAs are transcribed in human brainstem raphe nuclei and that only TPH transcripts with spliced 5'-UTR are synthesized. These short mRNAs may be efficiently translated and then rapidly degraded. In the carcinoid tumors and pineal gland, large amounts of TPH mRNAs are produced. Surprisingly, no short 5'-leader sequences of TPH messengers are detected in the carcinoid tumors by S1 nuclease protection, although they are detected in the pineal gland. The abundance of these high molecular weight TPH mRNAs in carcinoid tumors could reflect a high transcription rate or RNA stability peculiar to the mitotic character of this tissue. Although these tumors synthesize and secrete very high levels of serotonin, it is not known whether the pathological cells produce more active TPH than healthy enterochromaffin cells.
In conclusion, the cells expressing the TPH gene contain a large and complex variety of TPH mRNA forms differing in the 5'-UTR. Although the functional consequences of this phenomenon are only beginning to be investigated, it provides interesting clues to novel mechanisms of regulation of gene expression. An important aspect of TPH expression in the pineal gland is its rhythmicity. In rat, TPH activity and the mRNA levels have been shown to vary during the circadian rhythm (42, 54). This type of variation could imply an integrated regulation of TPH gene expression. An attractive hypothesis is that it involves differential splicing events leading to mRNAs that differ only in their 5'-leader sequence.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) X83213.