(Received for publication, May 24, 1996, and in revised form, November 22, 1996)
From the Department of Molecular Biology, University
of Geneva, Sciences II, 30, Quai Ernest-Ansermet, CH-1211
Geneva 4, Switzerland and the ¶ Department of Biology, Faculty
of Science, Chiba University,
1-33 Yayoi-cho, Inage-ku, Chiba 263, Japan
The gene encoding the TATA-binding protein, TBP,
is highly overexpressed during the haploid stages of spermatogenesis in
rodents. RNase protection analyses for mRNAs containing the
previously identified first, second, and eighth exons suggested that
most TBP mRNAs in testis did not initiate at the first exon used in somatic cells (here designated exon 1C). Using a sensitive
ligation-mediated cDNA amplification method, 5′ end variants of TBP
mRNA were identified, and the corresponding cDNAs were cloned
from liver and testis. In liver, a single promoter/first exon is used
to generate a steady-state level of roughly five molecules of TBP
mRNA per diploid cell equivalent. In testis, we detect modest
up-regulation of the somatic promoter and recruitment of at least five
other promoters. Three of the alternative promoter/first exons,
including 1C and two of the testis-specific promoter/first exons, 1D
and 1E, contribute roughly equivalent amounts of mRNA which, in
sum, account for greater than 90% of all TBP mRNA in testis. As a
result, round spermatids contain an estimated 1000 TBP mRNA
molecules per haploid cell. Testis TBP mRNA also exhibits several
low abundance 5′ end splicing variants; however, all detected TBP
mRNA leader sequences splice onto the common exon 2 and are
expected to initiate translation at the same site within exon 2. The
precise locations of the three major initiation exons are mapped on the
gene. The identification of the strong testis-specific promoter/first
exons will be important for understanding spermatid-specific tbp
gene regulation.
The TATA-binding protein, TBP,1 is required for transcription initiation by all three nuclear RNA polymerases (1). Somatic tissues contain from 1 to 5 TBP mRNA molecules per cell; these modest differences correspond qualitatively with relative differences in overall transcriptional activity in nuclei from different tissues (2, 3). In contrast, adult rodent testes exhibit roughly 350 molecules of TBP mRNA per cell (2). Testis-specific TBP mRNA overexpression arises primarily as a result of transcriptional up-regulation (2).
During puberty, the first stem cells to undergo spermatogenesis do so almost synchronously, such that different stages of puberty can be correlated to the first appearance of a specific spermatogenic stage in the testis (4). The onset of TBP overexpression during puberty correlates to the appearance of the first haploid cells. Moreover, as shown by immunocytochemistry, TBP protein overexpression occurs in the early haploid cells (2). Because these data suggest that TBP overexpression is restricted to only a subset of the cells in testis, we estimate that TBP mRNA levels in these cells are more than 1000-fold greater than those in somatic cell types. This high level of cell type-specific overexpression makes us suspect that TBP plays a role in spermatogenesis that differs from its ubiquitous functions in somatic cells.
With the goal of understanding the mechanisms regulating
spermatid-specific tbp gene expression, we have performed a
molecular analysis of the 5′ end of the tbp gene. Sumita
et al. (5) previously reported the intron/exon structure of
the mouse tbp gene from exons 2 to 8 (formerly designated
exons 1-7). In brain, spleen, and liver, use of a single upstream
promoter/first exon has been reported (6). Quantitative analyses
presented here show that testis-specific tbp expression
involves both modest up-regulation of the somatic tbp
promoter and recruitment of at least 2 other major and 3 minor
promoters. Using a sensitive method for amplifying and cloning cDNA 5′ ends, we characterize 10 TBP mRNA 5′ end variants in testis that
differ in promoter usage and/or splicing. The relative contribution of
each variant to total testis and liver TBP mRNA levels, the gene
structure, and the entire 5′ end genomic sequence are reported.
Genomic sequencing was performed on mouse 129 ES cell genomic clones. All other samples were prepared from fresh tissues harvested from MORO mice or Sprague-Dawley rats as indicated. Total RNA was prepared by sedimentation through CsCl cushions as described previously (3); nuclear RNA was prepared from citric acid-prepared nuclei sedimented through two sucrose cushions followed by purification of RNA through CsCl cushions as described previously (2).
RNase Protection Assays
RNase protection assays were
performed as described previously using the indicated amounts of sample
RNA supplemented with yeast RNA to 100 µg (2). Control lanes
contained probe and 100 µg of yeast RNA; probe control lanes
contained roughly a 1:100 dilution of nondigested probe and 10 µg of
yeast RNA carrier. Pseudo-pre-mRNA was transcribed from a mouse
genomic clone extending from the SacI site 700 bp upstream
of exon 1C to the SacI site in exon 3 using T3 RNA
polymerase; yield was determined spectrophotometrically (7). The
pseudo-pre-mRNA migrated as a single 5.3-kilobase band on agarose
gels, thus validating molar comparisons for probes hybridizing to
different regions of the pseudo-pre-mRNA. Pseudo-mRNA containing
exons 2-8 was transcribed from a mouse cDNA clone containing sequences beginning near the 5′ end of exon 2 (37 bp upstream of the
BglII site) and extending through the poly(A) tail.
Pseudo-mRNAs containing exons 1D-2 or exons 1E-2 were transcribed
from clones 62 and 36, respectively, of the rat 5′ end amplification
products shown in Fig. 4.
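Because the yield of each synthetic transcript was determined by mass, molar comparisons require a conversion from mass to moles. The following Python sketch illustrates that conversion for a transcript of known length. The per-nucleotide mass (320.5 g/mol) and the end correction (159 g/mol) are a commonly used approximation for single-stranded RNA and are assumptions here, not values taken from this paper; the function name is illustrative.

```python
# Approximate molar mass of single-stranded RNA (g/mol).
# ASSUMPTION: rule-of-thumb constants, not values reported in the paper.
AVG_NT_MASS = 320.5      # average mass contributed by one nucleotide
END_CORRECTION = 159.0   # correction for the transcript ends


def rna_amol(mass_ng: float, length_nt: int) -> float:
    """Convert a mass of RNA (ng) into attomoles for a transcript of length_nt."""
    molar_mass = length_nt * AVG_NT_MASS + END_CORRECTION  # g/mol
    moles = (mass_ng * 1e-9) / molar_mass                  # ng -> g -> mol
    return moles * 1e18                                    # mol -> amol


if __name__ == "__main__":
    # 1 ng of the 5.3-kilobase pseudo-pre-mRNA (length rounded to 5300 nt).
    print(f"{rna_amol(1.0, 5300):.1f} amol per ng")
```

Under these assumptions, 1 ng of a 5.3-kilobase transcript corresponds to several hundred attomoles, so standards in the amol range require dilutions into the picogram range.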
Probes used in this paper were as follows. The
BglII/EcoRI exon 2/3 probe was described
previously (2). The BglII/StuI and
Sau3AI/StuI probes were both transcribed from a
subcloned genomic fragment extending from the SacI site at −700 to the StuI site 35 bp downstream of exon 1C
linearized with BglII or Sau3AI, respectively.
The Sa/X probe was transcribed from a fragment spanning from the
SacI site at −700 to the XhoI site 343 bp
downstream of exon 1C linearized with Sau3AI. The X/A probe
was transcribed from a fragment spanning from the SacI site
at −700 to the ApaI site 1280 bp downstream of exon 1C
linearized with XhoI. The A/R1 probe was transcribed from a
subclone of the 699-bp ApaI/EcoRI fragment. The
R1/Bg probe was transcribed from a subclone of the 1043-bp
EcoRI/BglII fragment. The exon 1C/2 probes were
transcribed from clones 56 (mouse) or 1 (rat) of the 5′ cDNA
amplification products; the exon 1D/2 probes were transcribed from
clones 95 (mouse) or 62 (rat); and the exon 1E probes were transcribed
from clones 132 (mouse) or 36 (rat).
General
methods are described elsewhere.2 cDNA
was synthesized from 1 µg of total or poly(A)+-selected
mRNA using a primer that hybridizes to mRNA sequences in
tbp exon 6 (5′-CCATGTTCTGGATCTTGAAG-3′). A universal adapter was ligated to the 3′ end of the cDNA. TBP cDNAs were amplified by using an anti-adapter primer and, first, a primer specific for
sequences in exon 4 (5′-GAAGTGCAATGGTCTTTAGGTCAAGTTTACAG-3′), followed
by a 25-base primer (5′-CTC TCCCTAGAGCATCCTC-3′), which spanned the BglII site in exon 2 (underlined).
Amplified cDNAs were cloned into
BamHI/NotI-cut pBluescript KS+ plasmids (Stratagene) using the BglII site in exon 2 and a
NotI site in the adapter primer. Clones were sequenced using
the BglII-containing exon 2 primer. Plasmids with no inserts
(2% of the clones) would not sequence with this primer. False-primed
cDNA products did not exhibit the remaining 68 bases of exon 2 but
rather gave unrelated sequences (about 20% of the clones).
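Because the cDNA synthesis and amplification primers anneal to the mRNA (or to the sense strand), each primer is the reverse complement of the transcript sequence it targets. The Python sketch below derives the sense-strand target of the exon 6 and exon 4 primers listed above; the helper name is illustrative, and the exon 2 primer is omitted because its sequence is incomplete in this copy of the text.

```python
# Translation table for DNA base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as given in the text (5' to 3').
EXON6_PRIMER = "CCATGTTCTGGATCTTGAAG"
EXON4_PRIMER = "GAAGTGCAATGGTCTTTAGGTCAAGTTTACAG"

if __name__ == "__main__":
    for name, primer in (("exon 6", EXON6_PRIMER), ("exon 4", EXON4_PRIMER)):
        target = reverse_complement(primer)
        print(f"{name}: {len(primer)}-base primer, sense-strand target 5'-{target}-3'")
```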
Poly(A)+ mRNA
samples (20 µg) were mixed with 5.0 fmol of
³²P-end-labeled primer (either the
BglII-containing primer listed above or a distinct primer to
exon 2 sequences, 5′-TGCTGTTGTTCTGGTCCATG-3′) and hybridized under oil
overnight in 10 µl of hybridization buffer (10 mM Pipes,
pH 7.0; 400 mM NaCl; 1 mM EDTA; 0.05% SDS) at
45 °C. After hybridization, samples received 100 µl of dilution
buffer (10 mM Tris, pH 7.5; 300 mM NaCl; 0.5 mM EDTA); samples were extracted with chloroform and
precipitated with ethanol. Pellets were collected by centrifugation,
washed with 70% ethanol, and resuspended in 100 µl of reaction mix
(50 mM Tris, pH 8.3 at 25 °C; 75 mM KCl; 3 mM MgCl2; 5 mM dithiothreitol; 0.3 mM each of dATP, dCTP, dGTP, dTTP; 0.02 units/µl RNasin)
at 45 °C. Each sample received 200 units of BRL Superscript I RNase
H-free reverse transcriptase, and the incubator bath was adjusted to
50 °C (ramp time 45-50 °C = 10 min). After a 1-h
incubation, each sample received 100 µl of RNase mix (10 mM Tris, pH 7.5; 10 mM EDTA; 5 µg/ml RNase A; 1.8 units/µl RNase T1) and was incubated at room temperature for 10 min. Samples received 6.5 µl of proteinase K mix (100 mM
Tris, pH 7.5; 50 mM EDTA; 10% SDS; 1 mg/ml proteinase K;
0.5 mg/ml yeast RNA) and were incubated 10 min at 37 °C. Samples
were diluted with 200 µl of (10 mM Tris, pH 7.5; 300 mM NaCl; 0.5 mM EDTA; 0.2% SDS) and were
extracted with phenol/chloroform and precipitated with ethanol.
Products were resuspended in 75% formamide, 75 °C, and resolved on
denaturing polyacrylamide gels.
cDNAs were synthesized using 1 µg of total
or poly(A)+ mRNA. cDNAs (2% of each reaction) were
amplified using the BglII exon 2 primer described above and
primers specific to exon 1C (5′-GGCGGGTATCTGCTGGCGGTTTGGCT-3′) or exon
1D (5′-GGACCATCGCCTCGGCGGAGGTCCT-3′) using Taq polymerase and standard amplification conditions. Amplified products were separated by agarose gel electrophoresis and were visualized by ethidium bromide staining.
We
previously showed that a transcriptional mechanism is primarily
responsible for testis-specific overexpression of TBP mRNA (2). In
that study, probes complementary to sequences within either the 5′ or
3′ regions of the protein coding sequences of TBP mRNA gave
quantitatively indistinguishable results in all assays. Thus, we had no
a priori reason to suspect that testis TBP mRNA differed
qualitatively from somatic TBP mRNA. Recently, a cDNA clone
from mouse brain revealed that the gene contains an additional upstream
non-protein-coding exon (designated exon 1C; 6). Exon 1C is located
2888 base pairs upstream of the first protein-coding exon (designated
exon 2). We initiated an analysis of the use of this promoter/exon in
testis.
RNase protection assays using a genomic probe spanning sequences from
roughly 200 bases upstream of exon 1C to 35 bases downstream of exon 1C
(designated BglII/StuI probe, Fig. 1) revealed two clusters of protected fragments around
80 and 100 bases in length (Fig. 1A, clusters designated
1 and 2). Higher resolution conditions revealed
that the same pattern was reiterated by fragments 35 bases longer (105 and 135 bases in length, designated 1* and 2*) which were roughly 5-fold less abundant (Fig. 1B). Similar
patterns of protected fragments were observed in rat testis (Fig.
1A), mouse testis, and mouse spleen (Fig. 1B). We
interpreted this pattern as representing two clusters of initiation
sites represented in mRNA (clusters 1 and 2)
and in pre-mRNA (clusters 1* and 2*); however, because the probe was uniformly labeled and spanned both the
5′ and 3′ ends of the exon, we could not be certain that the heterogeneity did not occur at the 3′ end of the exon. Therefore, we
designed a second probe (designated Sau3AI/StuI
probe) which was truncated just downstream of the predicted clustered
initiation sites. This probe should give a precise 5′ end of the
protected fragment and thus allow us to test for heterogeneity at the
3′ end of the exon. The results using this probe confirmed a precise 3′ end of the protected fragment (Fig. 1C) and, thus by
inference, confirmed the interpretation that the multiple bands
observed in Fig. 1, A and B, represent
heterogeneity at the 5′ end of the mRNA.
Using the Sau3AI/StuI probe (Fig. 1C) or the BglII/StuI probe (not shown), the protected fragments that were 35 bases longer than the mRNA fragments (designated 1* and 2* in Fig. 1, A and B) were found to be over-represented in nuclear RNA preparations (Fig. 1C) and under-represented in polyadenylated mRNA preparations (not shown), as compared with the mRNA-specific signals. This corroborates the identity of these species as pre-mRNAs.
A clone of the entire genomic 5′ end region was transcribed in
vitro to produce a synthetic pseudo-pre-mRNA. Using a standard curve generated with this pseudo-pre-mRNA, we determined that adult
rat testis had 5-10 amol of exon 1C-containing mRNA, and 1 amol of
exon 1C-containing pre-mRNA, per µg of total RNA (Fig. 1C). Moreover, we found that testis had only 10-fold more
exon 1C-containing TBP mRNA per mass of total RNA than spleen (Fig. 1B). These quantitative values were unexpected because
previously, using a probe complementary to sequences in exon 8, we had
determined that rat testis contains between 20 and 40 amol of TBP
mRNA per µg of total RNA, and spleen contains much less than 1 amol/µg of total RNA (2). Therefore, our data suggested that only
about 30% of the TBP mRNA in testis, albeit possibly all TBP
mRNA in spleen, contains exon 1C.
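The arithmetic behind this estimate can be sketched as follows. The ranges are the values quoted above (5-10 amol/µg for exon 1C-containing mRNA, 20-40 amol/µg for exon 8-containing mRNA); the helper name and the use of range midpoints are illustrative choices, not part of the original analysis.

```python
def fraction_range(part, whole):
    """Return (low, mid, high) estimates of part/whole from (min, max) ranges."""
    part_lo, part_hi = part
    whole_lo, whole_hi = whole
    low = part_lo / whole_hi    # smallest numerator over largest denominator
    high = part_hi / whole_lo   # largest numerator over smallest denominator
    mid = ((part_lo + part_hi) / 2) / ((whole_lo + whole_hi) / 2)
    return low, mid, high


# Values quoted in the text, in amol per microgram of total testis RNA.
EXON_1C_MRNA = (5.0, 10.0)
EXON_8_MRNA = (20.0, 40.0)

if __name__ == "__main__":
    low, mid, high = fraction_range(EXON_1C_MRNA, EXON_8_MRNA)
    print(f"exon 1C fraction of testis TBP mRNA: {low:.0%} to {high:.0%} (midpoint {mid:.0%})")
```

The midpoint ratio is of the same order as the figure of about 30% given above, and the wide bounds show how much the estimate depends on where within each range the true values lie.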
Sequences that always coexist on the same mRNA molecule will show a
1:1 stoichiometry; molar quantities of exons 1C and 8 in testis exhibit
a 1:3 stoichiometry. This indicated that there must be at least two
species of TBP mRNA in testis (those with exon 1C and those
without); however, the point of divergence could lie anywhere between
exons 1C and 8. To more precisely determine which sequences differed
between TBP mRNA species in testis, the molar quantities of exon
1C- and exon 2- and 3-containing TBP mRNA in testis were determined
by RNase protection (Fig. 2). The results showed that
TBP mRNAs containing exons 2 and 3, like those containing exon 8, were 3-fold more abundant than mRNAs containing exon 1C in testis.
This suggests that all testis TBP mRNAs might contain exons 2, 3, and 8, but they are divergent at their 5′ ends. We wished to determine
the 5′ sequences of the 70% of testis TBP mRNAs that lack exon 1C.
Because we could detect no pre-mRNA signals containing sequences
upstream of exon 1C using the BglII/StuI probe
(Fig. 1, A and B), we predicted that very little
transcription initiated upstream of exon 1C. Thus we focused our
investigation on the region between exons 1C and 2.
Sequences within the First Intron Hybridize to Testis-specific mRNAs, Suggesting the Presence of Additional Testis-specific Exons
A series of four long probes completely encompassing the
region between exons 1C and 2 were used to search for evidence of alternative exons (Fig. 3). Due to UTP deprivation
during high specific activity transcription and to radiolysis
thereafter, it is nearly impossible to produce and maintain full-length
high specific activity probes of this size, so we expected to find many
smaller protected fragments on the gels. Therefore experiments were
designed such that true exonic mRNA-derived signals could be
distinguished from probe heterogeneity products based on three criteria. First, products arising from probe heterogeneity should appear upon hybridization to either cellular pre-mRNA or synthetic pseudo-pre-mRNA, whereas signals arising from spliced TBP mRNA should not appear in pseudo-pre-mRNA controls. Second, relative to
pre-mRNA signals, signals corresponding to bona fide mRNA
should be under-represented in nuclear as compared with total RNA
preparations (Fig. 1C). Finally, signals corresponding to
mRNA should be over-represented in polyadenylated mRNA
preparations as compared with signals arising from pre-mRNA.
Using the Sa/X probe (Fig. 3A), which contains 83 bases of exon 1C and 342 bases of sequence downstream of the exon 1C splice site, we were able to validate the method. Exon 1C-containing mRNA appears as an 83-base band which is under-represented in nuclear RNA preparations and is not detected using pseudo-pre-mRNA; unspliced pre-mRNA appears as a 425-base band which is abundant in nuclear RNA preparations and is indistinguishable from the signal obtained with pseudo-pre-mRNA. Interestingly, this probe also revealed a protected fragment about 350 bases long in total and nuclear RNA but not in pseudo-pre-mRNA (Fig. 3A) or poly(A)+ mRNA (not shown). Based on the size and distribution of this fragment, we suspect that it represents intronic sequences that have been excised from exon 1C. This species is present at about 0.2 amol/µg total RNA, which is 1/25 and 1/5 of the concentrations of exon 1C-containing TBP mRNA and pre-mRNA, respectively.
The Sa/X probe also yielded a cluster of protected fragments centered around 200 bases in length that fit the criteria for mRNA. Thus, we predicted that this region likely contains an alternate exon with multiple initiation sites. Indeed, these bands were subsequently found to arise from hybridization to TBP mRNAs containing the various initiation sites for exon 1D (see below).
With the X/A (Fig. 3B) and A/R1 (Fig. 3C) probes we also detected protected fragments that fit the criteria for alternate exons. Thus, within the region between XhoI and ApaI, we found evidence for exonic regions of 152 and 50 bases in length (Fig. 3B). Between ApaI and EcoRI, we found evidence for an exonic region of roughly 55 bases in length (Fig. 3C). The region between EcoRI and exon 2 showed no evidence of containing exonic sequences (not shown).
Cloning TBP mRNA 5′ Ends
Our data indicated that additional promoters and alternate
first exons were used in testis. Since the complexity observed by RNase
protection suggested that there were multiple "missing exons," we
wished to perform an exhaustive search that could recover all possible
mRNA 5′ end variants. Therefore, a sensitive ligation-mediated method for amplifying and cloning the 5′ ends of cDNAs2
was used to clone TBP cDNA 5′ ends using rat testis, mouse testis, and mouse liver RNA preparations. This method is analogous to a
"RACE"; however, it is more efficient at recovering rare cDNA ends. Of 144 clones, 74% contained TBP cDNAs. Restriction
digestion revealed that, although exons 1C and 2 contain no
PvuII site, 20% of the positive clones contained an
internal PvuII site (Fig. 4), strongly
suggesting the presence of a novel first exon. Moreover, Southern blots
indicated that only 28% of the clones contained exon 1C (Fig. 4).
The Southern blots revealed a great amount of size complexity in exon 1C-containing clones (Fig. 4). This was taken as further evidence that exon 1C initiates at multiple sites (see below). The size heterogeneity also indicated that the method had recovered a very large number of distinct clones, and thus, the resultant library should have sufficient complexity to allow recovery of clones representing even very rare mRNA species.
Of the 119 clones that were sequenced, 88 contained a TBP cDNA as evidenced by having non-primer-encoded TBP exon 2 sequences (Fig. 4). Six different first exons and several splicing variations were cloned from testis, whereas only two of these variants were cloned from liver. In sum, 10 different mRNA "types" (this does not include different initiation site variants within a cluster; see below) were identified. No species-specific differences in exon or initiation site usage were observed between rat and mouse. Of the six first exons recovered, three (1C, 1D, and 1E) were recovered numerous times; the other three were cloned only once each. However, as these latter three first exons were precisely spliced onto exon 2, we consider them rare, but bona fide, alternate first exons.
From liver, we analyzed 16 TBP-containing clones; 10 contained exon 1C and 6 contained exon 1E (63 and 37%, respectively; see below). From testis, we sequenced 72 TBP-containing clones. Of these, 31 clones (43%) were from mRNAs initiated at exon 1C; 28 clones (39%) were initiated at exon 1D; 10 clones (14%) were initiated at exon 1E; 1 clone (1%) was initiated at exon 1B; and 1 clone (1%) each at 2 different unmapped exons (designated U, clones 97 and 99; see below).
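The clone tallies above can be checked with a short Python sketch. The counts are those reported in the text; the dictionary keys and the function name are illustrative.

```python
# Sequenced TBP-containing clones, grouped by first exon (counts from the text).
TESTIS_CLONES = {
    "1C": 31,
    "1D": 28,
    "1E": 10,
    "1B": 1,
    "U (clone 97)": 1,
    "U (clone 99)": 1,
}
LIVER_CLONES = {"1C": 10, "1E": 6}


def percentages(counts):
    """Return each exon's share of the clones as a percentage of the total."""
    total = sum(counts.values())
    return {exon: 100.0 * n / total for exon, n in counts.items()}


if __name__ == "__main__":
    for tissue, counts in (("testis", TESTIS_CLONES), ("liver", LIVER_CLONES)):
        print(tissue, sum(counts.values()), "clones")
        for exon, pct in percentages(counts).items():
            print(f"  exon {exon}: {pct:.1f}%")
```

The testis and liver totals (72 and 16) sum to the 88 TBP-containing clones reported above.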
Mapping the tbp Gene Promoters and Exons
To locate the new exons, we sequenced the entire 5416-base pair genomic region from the SacI site 700 base pairs upstream of exon 1C to the PstI site in exon 3. Within this region, we were able to map four of the six first exons (exons 1B-1E) and one alternate intervening exon (see below). Two of the three rare first exons were not found and almost certainly lie further upstream than we have sequenced. Since these exons were not mapped on the gene, it is possible that each is composed of more than one exon. Because these exons, as well as exon 1B, represent less than 2% each of total TBP mRNA in testis (data not shown), we have not investigated them further.
Relative Contribution of TBP mRNA Variants to Total Liver and Testis TBP mRNA Pools
RNase protection probes specific for
exons 1B, 1C, 1D, and 1E were used to quantitate the relative abundance
of mRNAs containing each exon in liver and testis. Liver was chosen
as the representative somatic tissue specifically because, of all the
somatic tissues analyzed to date, liver contains the most TBP mRNA
per cell (2). Thus, the probability of detecting rare TBP mRNAs is
higher in liver than in other somatic tissues. In liver, only the probe to exon 1C gave a detectable signal (Fig.
5A). This was unexpected, as 37% of the
cDNAs that we cloned from liver contained exon 1E. We are not sure
why exon 1E-containing cDNAs were recovered from liver; we suspect
that a stochastic bias might have favored production, amplification, or
cloning of rare exon 1E-containing cDNAs from liver. In testis,
probes to exons 1C, 1D, and 1E all gave strong signals (Fig.
5A). Quantitative comparisons of the amounts of TBP
mRNAs containing exon 1C, exon 1D, or exon 1E indicated that, in
adult testis, all three of these alternate first exons are similarly
abundant (data not shown). mRNAs containing exon 1B were too rare
to detect in either testis or liver (less than 0.1 amol/µg RNA; not
shown).
Quantitative comparison of liver RNA to serial dilutions of testis RNA by RNase protection revealed the relative testis specificity of each exon type (Fig. 5A). A probe to exons 2 and 3 confirmed that overall TBP mRNA levels are 25-fold higher per equal DNA equivalent of tissue in testis (top panel). mRNA containing exon 1C was only 4-fold more abundant in testis than in liver. In contrast, although mRNAs containing exons 1D or 1E could not be detected in liver, dilutions of testis RNA indicated that our assay had the ability to detect 1/125 of the testis-specific signal. Thus, up-regulation of mRNAs containing exons 1D or 1E in testis is greater than 125-fold in magnitude. Similar RNase protection comparisons for all three first exons in testis, liver, brain, lung, and thymus confirmed that exons 1D and 1E were testis-specific (not shown). In a final attempt to detect exon 1D-containing TBP mRNA in liver, we used a reverse transcriptase-mediated polymerase chain reaction (RT-PCR) assay (Fig. 5B). The results showed that, whereas exon 1C/2 mRNAs could be detected in liver- and testis-polyadenylated mRNA samples and in liver, testis, and brain total RNA samples, exon 1D/2 mRNAs could only be detected in the testis samples. In conclusion, exons 1D and 1E appear to be truly testis-specific, whereas exon 1C is used in all tissues and is up-regulated modestly (4-fold above liver or 20-fold above thymus levels) in testis.
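The lower-bound reasoning used for exons 1D and 1E can be sketched in Python. If a signal is still detected at a given dilution of testis RNA but is undetectable in undiluted liver RNA, the liver level must lie below that dilution, so the fold difference exceeds its reciprocal. The 5-fold dilution steps below are an assumption made for illustration (the text states only that 1/125 of the testis signal was detectable), and the function name is hypothetical.

```python
from fractions import Fraction


def fold_lower_bound(detected_fractions):
    """Lower bound on the testis/liver ratio when liver gives no signal.

    detected_fractions: fractions of the testis sample that still gave a
    detectable signal. The smallest one defines the assay sensitivity.
    """
    return 1 / min(detected_fractions)


# ASSUMPTION: a 5-fold serial dilution series ending at 1/125.
DETECTED = [Fraction(1, 5 ** step) for step in range(4)]  # 1, 1/5, 1/25, 1/125

if __name__ == "__main__":
    print(f"up-regulation is greater than {fold_lower_bound(DETECTED)}-fold")
```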
tbp Initiation Sites and Promoter Sequences
The ligation-mediated cDNA amplification method allows precise
identification of the 3′ nucleotide of the first-strand cDNA synthesis product.2 Thus, assuming that the template RNA
was intact and the reverse transcriptase was processive, the sequence
of the cDNA clone allows precise identification of the initiation
nucleotide. The initiation sites thus identified on 5′ clones containing
exon 1C confirmed that this exon initiates at two clusters of sites as
suggested by RNase protection (Fig. 1). Exons 1D and 1E were also found to initiate at multiple sites. For these two exons, two zones of
initiation separated by 130 or 55 bases, respectively, were identified
(Fig. 6 and see below).
Although the sequence of the cDNA 5′ end clones presented above
allowed precise identification of the transcription initiation sites of
individual mRNAs, it was possible that, after amplification and
cloning, individual clones might have become either over- or
under-represented in the population. Thus, we wished to confirm the
relative frequencies of initiation site usage by direct
primer-extension analysis. Primer extension on
poly(A)+-selected rat testis mRNA using either of two
primers specific for sequences in exon 2 confirmed the initiation sites
(Fig. 7). Samples containing
poly(A)+-selected liver RNA confirmed that the signals were
testis-enriched, as expected for TBP mRNA-specific signals. The
sizes of the individual products corresponded to those predicted from
the cDNA 5′ end clones in Fig. 4 and to the sizes of RNase
protection products mapped in Figs. 1 and 3. The start sites are
diagrammed in Fig. 8.
The regions upstream of all of the initiator exons are TATA-less, which
is consistent with their having clustered initiation sites (8). Roughly
35 bases upstream from the major site of internal initiation for exon
1E is the sequence TATAT, which bears some resemblance to a TATA box
(Fig. 8) (9). However, the imprecise initiation noted in this region is
reminiscent of TATA-less promoters, and thus it is likely that this
sequence is not sufficient to direct transcription initiation to a
single nucleotide. Comparison of the entire tbp 5′ region
with published data bases revealed no striking similarities (excluding
repetitive sequences) with other published sequences. Indeed, with the
exception of an SP1 binding site in the exon 1C promoter (6), the
sequences upstream of all of the TBP initiation sites bear little
obvious resemblance to previously identified promoters (see below).
TBP protein is required for all nuclear transcription initiation and, thus, is a fundamental component of all cells. Unexpectedly for a gene with such ubiquitous and apparently well defined functions, the tbp gene is highly overexpressed in the early haploid stages of spermatogenesis. Because spermatogenesis is a very complex process that involves interactions between many cell types (10), we suspect that a full understanding of the roles of TBP in spermatogenesis can only come through manipulation of TBP expression in animals. Identification of the spermatid-specific regulatory mechanisms for the tbp gene is a requisite step on the path to manipulating TBP expression in transgenic animals and, ultimately, to understanding the roles of TBP overexpression in spermatogenesis.
A molecular analysis of the 5′ end of the tbp gene is
presented. Our results show that liver uses almost exclusively a single promoter/first exon and produces predominantly a single species of TBP
mRNA. In contrast, testis initiates transcription at no fewer than
3 major and 3 minor first exons and produces at least 10 different TBP
mRNA types (6 abundant and 4 rare). This work precisely localizes
the testis-specific transcription initiation sites and testis-specific
exons on the tbp gene. This indicates where to focus a
search for the spermatid-specific regulatory mechanisms, and more
importantly, it indicates which sequences to target in a rational
mutagenesis of putative spermatid-specific regions of the
tbp gene in transgenic animals.
What is the purpose for the
testis-specific TBP mRNA heterogeneity? One possibility was that
the different mRNAs might generate distinct protein products.
However, only mRNAs initiated at the more upstream region of exon
1E have a translational start codon that could produce an alternate
protein (72 amino acids long, not in the TBP reading frame), and the
ATG for this polypeptide is in a poor context for initiation (11).
Thus, we predict that all 10 types of TBP mRNA characterized here
only give rise to normal TBP protein. A similar situation has been
reported for the cytochrome c gene, which produces a
testis-specific mRNA that is predicted to use the same open reading
frame as somatic cytochrome c mRNA (12). It is possible
that the upstream ATGs noted on some of the TBP mRNA 5′ end
variants might play a role in translational regulation (see below).
A second possibility is that each individual promoter is favored by a specific subset of nuclei types. Indeed, somatic tissue nuclei appear to use exclusively exon 1C for initiation. It was possible, for example, that exons 1C, 1D, and 1E were each preferred by germ cells at a specific developmental stage. However, we have analyzed the developmental onset of accumulation of mRNAs containing exons 1C, 1D, and 1E in testis, and we cannot detect differential temporal accumulation of any of these mRNAs (not shown).
A third possible reason for generating different mRNA species might
be to allow different post-transcriptional regulation of TBP protein
accumulation. Recently, a study on the copper-zinc superoxide
dismutase gene showed that, as for tbp,
testis-specific expression involves recruitment of two testis-specific
promoters in addition to the somatic promoter (13). Of the three
different mRNA types that accumulate in testis, one (arising from
one of the testis-specific promoters) is sequestered as
ribonucleoprotein particles; the other two are predominantly polysomal
(13). A fraction of the TBP mRNA in whole testis is also
sequestered as ribonucleoprotein particles (not shown). Presumably,
these stored mRNAs are translated at a later time. Different
sequences in the 5′-nontranslated leader of TBP mRNAs resulting
from different first exon usage, possibly including the presence of
upstream ATGs, might target mRNAs to be either translated
immediately or stored for later use.3
The numerous splicing variations that arise in testis are another curiosity. We can currently find no rational explanation for why transcripts initiating at the upstream region of exon 1D must use two splice donor sites (positions +68 and +179 in Fig. 8). Spermatids might simply provide a "sloppy" or "promiscuous" splicing environment, such that otherwise cryptic splice donor and acceptor sites can be used. Indeed, numerous genes have been shown to exhibit splicing patterns in testis that are not found in somatic tissues (e.g. Refs. 14-16). In the case of TBP mRNA, where all of the alternatively spliced mRNAs are expected to produce the same protein, such alternate splicing would not be deleterious.
Testis-specific Transcription of the tbp Gene
Comparison to
public data bases indicates that the tbp gene 5′ end
exhibits no notable sequence similarities to previously described
spermatid-specific promoters. A search for putative transcription
factor recognition sequences4 revealed
little evidence of what factors might be regulating this gene. Thus,
although the entire 5′ region of tbp is peppered with
putative recognition sites for the "testis-determining factor," SRY
(frequency, 1 site per 168 bases over 5416 base pairs; 17), a similar
frequency (1 site in 281 bases) was found in the 111,400-base pair
0-2.4-min region of the Escherichia coli genome. Thus, the concentration of putative SRY sites in the tbp gene does not
appear to differ significantly from that found in an arbitrary and
physiologically irrelevant sequence.
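The site frequencies quoted above can be put on a common footing with a short Python sketch. The region lengths and spacings are those given in the text; the function name is illustrative, and the calculation only compares densities. It is not a test of statistical significance.

```python
def implied_sites(region_bp: int, bases_per_site: float) -> float:
    """Approximate number of sites implied by an average spacing over a region."""
    return region_bp / bases_per_site


# Values quoted in the text.
TBP_REGION_BP = 5416
TBP_BASES_PER_SITE = 168
ECOLI_REGION_BP = 111_400
ECOLI_BASES_PER_SITE = 281

# Ratio of site densities (tbp 5' region relative to the E. coli region).
DENSITY_RATIO = ECOLI_BASES_PER_SITE / TBP_BASES_PER_SITE

if __name__ == "__main__":
    print(f"tbp region: ~{implied_sites(TBP_REGION_BP, TBP_BASES_PER_SITE):.0f} putative sites")
    print(f"E. coli region: ~{implied_sites(ECOLI_REGION_BP, ECOLI_BASES_PER_SITE):.0f} putative sites")
    print(f"density ratio: {DENSITY_RATIO:.2f}")
```

By this arithmetic the two site densities differ by less than 2-fold.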
Putative binding sites were noted for two other spermatid-enriched transcription factors, a site for the SRY-related Sox-5 protein (18) and several putative cAMP-response elements (CREs; 19) which are the binding sites for CREM and CREB. The one putative Sox-5 site is located roughly 850 base pairs downstream of the exon 1E promoter. Five putative CREs are found as follows: one roughly 165 base pairs upstream of exon 1B, one at the major point of initiation for exon 1C, one roughly 220 base pairs, and one roughly 35 base pairs upstream of exon 1E, and one roughly 200 base pairs downstream of the promoter of exon 1E. CREB is expressed in many testis cell types, whereas CREM is predominantly restricted to the germ cells (20). Two recent reports show that CREM-deficient mice cannot complete spermatogenesis (21, 22). However, in one of these reports, TBP mRNA was used as a control for an RNase protection experiment, and its expression did not appear diminished in the CREM-deficient mice (22). Thus, it appears that the major CRE-binding protein in spermatids is dispensable for TBP overexpression. Binding sites for other putative spermatid-specific transcription factors such as Tet-1 (23) or Zfy-1 (24) have not yet been selected from random sequences, and thus remain largely undefined. It remains possible that, despite a lack of obvious sequence identity, the tbp gene might share testis-specific regulatory signals with other spermatid-specific genes. Accurate delineation of what cis-regulatory sequences are important for testis-specific TBP overexpression will require a functional analysis of these sequences in transgenic animals.
A final point to consider is the reason for the rare promoter/first exons in testis. The mRNAs arising from these exons in testis are at least 10-fold less abundant than TBP mRNA in somatic cells and are predicted to yield the exact same protein product. Thus, we suspect that these transcripts do not have a unique function. Rather, they might result from promiscuous transcription initiation in spermatids. Spermatids contain greatly elevated levels of all measured components of the basal RNA polymerase II transcription machinery (2).3 A model has been proposed for how increased levels of the transcription machinery should decrease promoter stringency and thus promote transcription initiation at sequences that would otherwise not be recognized as promoters (25). This model might explain many cases of spermatid-specific gene expression, including the rare TBP mRNAs.
In summary, testis-specific up-regulation of the tbp gene involves recruitment of two very strong testis-specific promoter/first exons. This work will be important for further resolving the signals regulating spermatid-specific TBP expression and ultimately for understanding the reason why spermatids contain 1000-fold more TBP mRNA molecules per cell than do somatic cell types.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) D86619.
We thank P. Fonjallaz for assistance with PCR, D. Lavery for critically reading the manuscript, and N. Roggli for photography.