(Received for publication, December 14, 1994)
To locate in detail the regions in the human androgen receptor
(AR) involved in transcription activation, a series of N-terminal
deletions was introduced in the wild type AR and in a constitutively
active AR. The different constructs were tested for their capacity to
activate transcription. Almost the entire N-terminal domain (residues
1-485) was necessary for full wild type AR activity when
cotransfected with the (GRE)2tkCAT reporter in HeLa cells.
In contrast, a smaller part of the N-terminal domain (amino acids
360-528) was sufficient for the constitutively active AR to
induce transcription of the same (GRE)2tkCAT reporter in
HeLa cells. This demonstrates the capacity of the AR to use different
regions in the N-terminal domain as transcription activation units
(TAUs). To obtain additional information on AR N-terminal TAUs, the
GAL4 DNA binding domain was linked to either the entire AR N-terminal
domain or parts of it and cotransfected with the (UAS)2tkCAT
reporter in HeLa cells. The results confirmed that the first 485 amino
acid residues accommodate a transcription activation function. When the
chimeric AR-GAL4 constructs were tested on a different reporter
((UAS)5E1bCAT), a small shift in the position of the TAU,
responsible for full transcription activation, was observed. The data
presented show that the size and location of the active TAU in the
human AR is variable, being dependent on the promoter context and the
presence or absence of the ligand binding domain.
Steroid hormone receptors constitute one of the best available
model systems for studying regulation of gene transcription. Based upon
phylogenetic studies, the family of nuclear receptors can be grouped
into three subfamilies: 1) steroid receptors, to which the androgen
receptor (AR) also belongs, 2) thyroid hormone and retinoid
receptors, and 3) orphan receptors that lack a well defined
ligand (1). Upon testosterone or dihydrotestosterone binding,
the AR undergoes several sequential processes to interact with cognate
DNA sequences. These DNA sequences (hormone response elements (HREs))
are commonly located in the regulating regions of the target genes. The
binding of the AR to the HRE results in the formation of a stable
preinitiation complex near the transcription start site, which allows
efficient transcription initiation by RNA polymerase II. The mechanism
by which steroid receptors stabilize or interact with the preinitiation
complex is poorly understood. There is experimental evidence for a
direct interaction of receptor with the general transcription factor
IIB (2, 3). Furthermore, steroid receptors could
indirectly associate with the preinitiation complex via so-called
bridging factors (co-activators) or could make promoters accessible for
other transcription activators by nucleosome displacement (reviewed in (4, 5, 6, 7, 8)).
All nuclear receptors are composed of at least four functional domains: the N-terminal domain, the DNA binding domain (DBD), the hinge region, and the ligand binding domain (9, 10, 11). The DNA binding domain, composed of two "zinc finger" structures, is, like the C-terminal ligand binding domain, highly conserved among the steroid receptors (1). Interestingly, although the N-terminal domain of all steroid receptors harbors a transcription activation function, its length and amino acid composition are unique for each receptor (12, 13, 14, 15, 16, 17, 18). The N-terminal domains of the androgen, glucocorticoid (GR), progesterone (PR), and mineralocorticoid (MR) receptors each account for approximately half of the total receptor size. Compared with other nuclear receptors, this is exceptionally large, and it coincides with the observation that the AR, GR, PR, and MR are the only nuclear receptors recognizing HREs that fit consensus glucocorticoid response elements (GREs) (1, 19, 20). The specific GRE binding is reflected by the presence of a glycine, a serine, and a valine residue in the so-called P-box located in the first zinc finger of the DNA binding domain (21, 22, 23). Based on the P-box sequence, the AR, GR, PR, and MR are classified in the "GSckV group" (1, 19). The residues in the P-box determine which type of HRE half-site (consensus GRE half-site, TGTTCT, or estrogen response element half-site, TGACCT) the receptor recognizes (21, 22, 23, 24). Almost all members of the thyroid hormone, retinoid receptor, and orphan receptor subfamilies bind HREs containing estrogen response element-like half-sites and are classified in the "EGckA/EGckG/EGckS" P-box groups (1, 19).
In these subfamilies, different mechanisms have been described that enlarge diversity and could explain, at least partially, cell and promoter discrimination: receptor heterodimerization, variable spacing of the HRE half-sites, and direct repeat or inverted repeat orientation of the two HRE half-sites (25, 26, 27, 28, 29, 30). Since this variability has not been found for the "GSckV class" receptors, it seems reasonable to assume that the large and unique N-terminal domains of the AR, GR, PR, and MR are important in cell- and receptor-specific regulation of target genes through multiple N-terminal TAUs (also referred to as TAF, AF, or TAD). For some promoters, experimental evidence has been provided showing that promoter context as well as the receptor N-terminal domain could determine receptor specificity (31, 32).
Several different sequence motifs that characterize transcription activation units have been identified thus far, including acidic regions (acidic activation domains), proline-rich domains, and glutamine-rich domains (reviewed in (5, 33, 34, 35)). Although the AR N-terminal domain does not possess a significant sequence identity with other known transcription factors, it is glutamine and proline rich (including homopolymeric Gln and Pro stretches) and has a relatively high number of acidic amino acids (36).
To investigate the size and location of the TAU responsible for the transactivating capacity of the human AR in different situations, a series of N-terminal deletions was introduced in the wild type AR and in a constitutively active AR. These AR mutants were tested for their ability to activate transcription of an androgen-responsive reporter gene. In addition, chimeric AR-GAL4 DNA binding domain constructs were generated to obtain additional evidence for the localization of TAU regions.
Transfections of complete series of AR mutants were performed five times in duplicate, using at least two independent plasmid extracts. In each experiment, CAT activities were corrected for the reporter background and expressed for each mutant as a percentage of the total number of counts in the assay. The means (±S.E.) were calculated, and the starting mutants (AR0, AR5, or AR4G) were set to 100%.
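The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the order of the steps, and the example numbers are hypothetical and are not taken from the study, and S.E. is assumed to mean the standard error of the mean across independent experiments.

```python
from math import sqrt
from statistics import mean, stdev


def percent_conversion(product_counts, total_counts):
    """CAT activity as a percentage of the total counts in one assay."""
    return 100.0 * product_counts / total_counts


def background_corrected(activity, reporter_background):
    """Subtract the activity measured with the reporter plasmid alone."""
    return activity - reporter_background


def relative_mean_se(mutant_activities, reference_activities):
    """Mean and standard error, with the reference mean set to 100%."""
    reference_mean = mean(reference_activities)
    scaled = [100.0 * a / reference_mean for a in mutant_activities]
    return mean(scaled), stdev(scaled) / sqrt(len(scaled))


# Hypothetical background-corrected activities from five experiments.
reference = [40.0, 42.0, 38.0, 41.0, 39.0]  # e.g. a starting receptor
mutant = [26.0, 27.5, 24.5, 27.0, 25.0]     # e.g. one deletion mutant
rel_mean, rel_se = relative_mean_se(mutant, reference)
print(f"{rel_mean:.1f}% +/- {rel_se:.1f}%")
```

Scaling every series to the mean of its own starting receptor is what makes the AR0, AR5, and AR4G series comparable only within, not between, series.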
The following primers introducing the insertion or deletion were used: AR3, 5'-CAAGCTCAAGGATGGAATCTAGATCGATACGCGTGCAGTTAGGGCTG-3'; AR9, 5'-AGGCACCCAGAGGCCGCATCGATCACAGGCTACCTGGTC-3'; AR10, 5'-CCAAGCCCATCGTAGATCGATGCCGCAGCAGCTGCCA-3'; AR11, 5'-GAGCCGCCGTGGCCGCATCGATGCAACTCCTTCAGC-3'; AR19, 5'-ACATCCTGAGCGAGGCATCGATGGGCCTGGGTGTGG-3'; AR34, 5'-GCTGAAGAAACTTGGATCGATTGAAGGCTATGAATG-3'; AR53, 5'-CTTTCCACCCCAGAGATCGATTGATAAATTCGGA-3'; and AR62, 5'-GCTGGCGGGCCAGGAATCGATGCGTTTGGAGACTG-3'.
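Because the subsequent constructs rely on ClaI fragments, it can be useful to confirm that each listed primer carries a ClaI recognition sequence (ATCGAT). The short Python sketch below does this for the sequences as printed above; the dictionary and function names are hypothetical, and the check says nothing about where the site ends up in the final plasmid.

```python
CLAI_SITE = "ATCGAT"  # ClaI recognition sequence

# Primer sequences copied from the list above (5' to 3').
PRIMERS = {
    "AR3": "CAAGCTCAAGGATGGAATCTAGATCGATACGCGTGCAGTTAGGGCTG",
    "AR9": "AGGCACCCAGAGGCCGCATCGATCACAGGCTACCTGGTC",
    "AR10": "CCAAGCCCATCGTAGATCGATGCCGCAGCAGCTGCCA",
    "AR11": "GAGCCGCCGTGGCCGCATCGATGCAACTCCTTCAGC",
    "AR19": "ACATCCTGAGCGAGGCATCGATGGGCCTGGGTGTGG",
    "AR34": "GCTGAAGAAACTTGGATCGATTGAAGGCTATGAATG",
    "AR53": "CTTTCCACCCCAGAGATCGATTGATAAATTCGGA",
    "AR62": "GCTGGCGGGCCAGGAATCGATGCGTTTGGAGACTG",
}


def count_site(sequence, site=CLAI_SITE):
    """Count (possibly overlapping) occurrences of a site in a sequence."""
    seq = sequence.upper()
    return sum(seq.startswith(site, i) for i in range(len(seq)))


site_counts = {name: count_site(seq) for name, seq in PRIMERS.items()}
for name, n in site_counts.items():
    print(f"{name}: {n} ClaI site(s)")
```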
All of these AR mutants were sequenced to verify the correct reading frame before they were used as starting constructs for the following AR mutants. pAR110 was constructed by removing the 2.8-kilobase ClaI-BamHI fragment, containing most of the AR cDNA, from pAR3 and insertion of the 2.5-kilobase ClaI-BamHI fragment from pAR9 into pAR3. In the same way, pAR120 (combination of pAR3 and pAR10), pAR121 (combination of pAR3 and pAR11), pAR123 (combination of pAR3 and pAR22), pAR104 (combination of pAR3 and pAR62), pAR60 (combination of pAR22 and pAR62), and pAR55 (combination of pAR9 and pAR19) were constructed. pAR113 was constructed by removal of the internal fragment of pAR62 digested with RsrII and ClaI. Ligation of the blunt-ended plasmid resulted in an in-frame deletion. pAR130 and pAR131 were constructed by combining the deletions of pAR110 and pAR62, and pAR110 and pAR113, respectively.
pAR5 was constructed by ligation of an XbaI linker (Promega) that contains an in-frame stop codon into the blunt-ended ClaI site of pAR34. pAR124, pAR127, pAR106, pAR99, pAR126, pAR98, pAR100, pAR115, pAR117, and pAR132 were constructed by introduction of the deletion present in pAR120, pAR121, pAR123, pAR62, pAR113, pAR60, pAR55, pAR22, pAR61, and pAR104, respectively, into pAR5. pAR105 was constructed by combining the pAR3 and pAR100 mutants as described above. pAR128 was constructed by combining the deletions of pAR106 and pAR99.
pAR0G was constructed by insertion of the ClaI-digested polymerase chain reaction fragment encoding the GAL4 DNA binding domain (GDBD, amino acids 1-147) into the ClaI-digested pAR53 vector. The orientation and sequence of the polymerase chain reaction insert were checked. The following primers were used on the pSG424 plasmid (41) template to obtain the correct GAL4 DBD fragment: GDBD-A (5'-primer), 5'-CAAGCCTCCTGATCGATGAAGCTACTG-3', and GDBD-B (3'-primer), 5'-CCCGGGAATTCCATCGATACAGTCAAC-3'.
pAR4G was constructed by
ligation of an XbaI linker that contains an in-frame stop codon
into the blunt-ended ClaI site of pAR0G (multiplied in E.
coli DH5). Note that pAR0G contains two ClaI sites but that only the ClaI site 5' of the GDBD
insert is dam methylated and therefore protected from ClaI digestion. pAR94G, pAR107G, pAR105G, pAR106G, pAR99G,
pAR113G, pAR98G, pAR100G, pAR115G, and pAR117G were constructed by
introduction of the deletion present in pAR124, pAR127, pAR105, pAR106,
pAR99, pAR126, pAR98, pAR100, pAR115, and pAR117, respectively, into
pAR4G. pAR96G, pAR91G, pAR85G, and pAR84G were constructed by combining
the deletions of pAR94G and pAR99G, pAR107G and pAR99G, pAR94G and
pAR113G, and pAR107G and pAR113G, respectively.
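The remark about dam methylation can be illustrated with a small check. Dam methylase modifies GATC, and ClaI cleavage is blocked when a methylated GATC overlaps its ATCGAT site, which happens when the site is directly preceded by G or followed by C. The Python sketch below applies this rule to the two GDBD primers listed above. It is a hedged illustration only: the function name is hypothetical, the primers are used as a stand-in for the site context, and the actual junction sequences in pAR0G and the methylation status in a given host strain are not verified here.

```python
import re

CLAI_SITE = "ATCGAT"  # ClaI recognition sequence; dam methylase targets GATC


def clai_sites_with_dam_overlap(sequence):
    """Return (position, overlaps_GATC) for every ClaI site in the sequence."""
    seq = sequence.upper()
    results = []
    for match in re.finditer(f"(?={CLAI_SITE})", seq):
        start = match.start()
        end = start + len(CLAI_SITE)
        # A G directly upstream creates GATC across the 5' edge of the site.
        upstream = start > 0 and seq[start - 1] == "G"
        # A C directly downstream creates GATC across the 3' edge of the site.
        downstream = end < len(seq) and seq[end] == "C"
        results.append((start, upstream or downstream))
    return results


GDBD_A = "CAAGCCTCCTGATCGATGAAGCTACTG"  # 5'-primer, as listed above
GDBD_B = "CCCGGGAATTCCATCGATACAGTCAAC"  # 3'-primer, as listed above

for name, primer in (("GDBD-A", GDBD_A), ("GDBD-B", GDBD_B)):
    print(name, clai_sites_with_dam_overlap(primer))
```

Under these assumptions, only the site in the 5'-primer overlaps a GATC sequence, which is in line with the statement that only the ClaI site 5' of the GDBD insert is dam methylated.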
All mutants were expressed in COS-1 cells, and the AR proteins were analyzed by Western blotting (16). Using the antibodies SP197, SP061, SP066, F52.24.4, F39.4.1 (42, 43, 44, 45), and 2GV3/3GV2 (46), all mutants except AR106 could be visualized. The expression levels of the various mutants were comparable. AR106 could only be detected using the F52.24.4 antibody. This monoclonal antibody is not specific enough to use directly for development of Western blots loaded with whole cell lysates, since it recognizes many zinc finger-containing proteins. We were unable to detect AR106 after immunoprecipitation, most likely because AR106 (calculated molecular mass of approximately 28 kDa) comigrated with the light chain of the F52.24.4 antibody.
Additionally, the key mutants pAR0, pAR5, pAR0G, and pAR4G were checked for nuclear localization (47).
The pG29GtkCAT ((GRE)2tkCAT)
reporter plasmid (48) contains two
progesterone/glucocorticoid-responsive elements in front of the
thymidine kinase (tk) promoter linked to the CAT gene. The
(UAS)2tkCAT reporter (kindly provided by Dr. R. Renkawitz
and Dr. M. Muller) is in principle the same except that the two
progesterone/glucocorticoid-responsive elements are replaced by two
upstream-activating sequences (UAS) to which the GAL4 DNA binding
domain can bind (49). The G5E1bCAT reporter ((UAS)5E1bCAT) contains five UASs in
front of the E1b TATA box linked to the CAT gene (50).
Figure 1: Functional analysis of N-terminal deletion mutants of the wild type AR0. Transcriptional
activity was examined by cotransfection of AR expression plasmids and a
(GRE)2tkCAT reporter plasmid. CAT activity was determined
from cell lysates of transfected HeLa cells cultured in the absence or
presence of 1 nM R1881. Activities were corrected for the
(GRE)2tkCAT background, and the means (±S.E.) of the
R1881-treated samples of five independent assays are presented as a
percentage relative to that of the wild type AR0.
To locate the minimal part of the N-terminal domain that still retained most of the transactivation capacity, AR130 and AR131 were constructed. Although both mutants were quite capable of inducing transcription (approximately 65% compared with the wild type AR0), the only AR mutant retaining full transcription activity was AR62, which again indicates that almost the entire AR N terminus is necessary for wild type receptor activity (residues 1-485, designated TAU-1).
Surprisingly, in contrast to the results found in the AR0 deletion analysis, truncation of the first 360 amino acids in the constitutively active AR (AR106) did not result in a decrease in the transactivating capacity, indicating a TAU in the 360-528 region (Fig. 2). All other deletion mutants confirm the presence of a TAU in this area (designated TAU-5). Deletions of part of the 360-528 domain resulted in complete loss of or decreased transactivating capacity (AR99, AR126, AR98, and AR117). In contrast, in the mutants AR100, AR115, AR124, AR127, and AR105, in which the 360-528 region is present, complete (or almost complete) transactivating capacity was observed. So, the smallest N-terminal region responsible for full AR5 transcription activity is located between residues 360 and 528.
Figure 2: Functional analysis of N-terminal deletion
mutants of the constitutively active AR5. Transcriptional activity was
examined by cotransfection of AR expression plasmids and a
(GRE)2tkCAT reporter plasmid. CAT activity was determined
from cell lysates of transfected HeLa cells cultured in the absence of
R1881. Activities were corrected for the (GRE)2tkCAT
background, and the means (±S.E.) of five independent assays are
presented as a percentage relative to that of AR5. AR106 is the only
receptor mutant that could not be visualized by Western blotting.
Figure 3: Functional analysis of N-terminal deletion
mutants of the chimeric AR4G. Transcriptional activity was examined by
cotransfection of AR-GAL4 expression plasmids and a
(UAS)5E1bCAT (empty bars) or
(UAS)2tkCAT (filled bars) reporter
plasmid. CAT activity was determined from cell lysates of transfected
HeLa cells cultured in the absence of R1881. Activities were corrected
for the reporter background, and the means (±S.E.) of five
independent assays are presented as a percentage relative to that of
AR4G.
The same series of chimeric proteins was tested on the
(UAS)5E1bCAT construct (Fig. 3, empty bars). This reporter contains five UASs upstream of the
E1b TATA box and represents a different promoter
environment (50). The transcription activation capacities of
the chimeric proteins were different when tested on the
(UAS)5E1bCAT reporter. A large reduction in transactivating
capacity was observed between mutants AR94G (Δ1-142) and
AR107G (Δ1-188), which shows that the region between amino
acids 142 and 188 is essential for the chimeric proteins to be able
to activate transcription of the (UAS)5E1bCAT reporter. In
contrast, the capacities of AR94G and AR107G to activate transcription
of the (UAS)2tkCAT reporter were not significantly
different. This promoter specificity, determined by AR amino acids
142-188, is also obvious for mutants AR100G and AR91G (Fig. 3). On the other hand, the region between amino acids 360
and 550 is more important for the chimeric proteins to be able to
activate transcription of the (UAS)2tkCAT construct. AR106G,
which only contains this last part of the AR N-terminal domain, was
incapable of inducing transcription when cotransfected with the
(UAS)5E1bCAT construct but retained approximately 35%
activity when tested on the (UAS)2tkCAT reporter. When
tested on the (UAS)5E1bCAT reporter, the minimal part of the
AR N-terminal domain still capable of inducing full transcription
activity is located between residues 142 and 485 (AR96G).
To dissect the transcription activation properties of the
human AR, a series of AR deletion mutants was analyzed to characterize
and locate N-terminal regions essential for transcription activity.
Deletion mapping of the wild type AR revealed that for full receptor
activity, almost the entire N-terminal domain is necessary (TAU-1,
residues 1-485) (Fig. 4). Any deletion, except for the
deletion of the last 42 residues in the AR N terminus, affected the
capacity to induce transcription when analyzed in HeLa cells using the
(GRE)2tkCAT reporter. The minimal region (designated core
region) that still retained over 50% transcription activity compared
with AR0, was located between residues 101 and 370 (Fig. 4).
Interestingly, these results are clearly different from the data
obtained in the deletion mapping studies of the constitutively active
AR5. The same series of deletions was introduced in AR5, and the
respective mutants were analyzed under identical experimental
conditions (HeLa cells using the (GRE)2tkCAT reporter).
These studies revealed that a region between amino acids 360 and 528
(designated TAU-5) was sufficient for full constitutive transcription
activity (Fig. 4). This demonstrates the capability of the AR to
use different N-terminal regions for transcription activation.
Furthermore, these observations show a determinant role for the ligand
binding domain in TAU functioning. Deletion of the ligand binding
domain resulted in loss of TAU-1 activity and induced the use of TAU-5.
These findings might indicate a functional interaction between the
ligand binding domain and the AR N-terminal domain. Such an interaction
has already been suggested by McPhaul and co-workers (51). They
examined a mutant AR, which contained two structural alterations: a
shortened N-terminal glutamine stretch and a tyrosine to cysteine
substitution in the ligand binding domain (51). Interestingly,
the ability of the AR to activate transcription was strongly diminished
only when both alterations were present, indicating a potential
interaction between the two different domains. The observation that
deletion of the ligand binding domain induces the use of TAU-5,
however, does not permit one to draw conclusions with respect to its
functional in vivo existence, since so far, constitutively
active ARs have not been identified in intact in vivo systems.
Figure 4: Summary of the regions of the AR N-terminal domain responsible for the transactivating capacity of the wild type AR and AR mutants. Total bars represent the region necessary for full receptor activity compared with the starting receptor (AR0, AR5, or AR4G). Thick bars, as part of the total bar, represent the region responsible for 50% or more of the transcription activity (core region).
Although TAU-1 and TAU-5 overlap, the cores of the two TAUs (each responsible for over 50% of the activity) are separate N-terminal regions with individual characteristics. TAU-1 contains a relatively high number of acidic amino acids, three glutamine repeats, one of which is polymorphic, and potential phosphorylation sites (36, 42). The TAU-5 core is not acidic and harbors three different amino acid stretches: 1) a proline stretch (residues 371-378), 2) an alanine stretch (residues 397-401), and 3) a glycine stretch (residues 448-463) (36). The role of the different amino acid stretches in the TAU-5 region is not known. Lengthening of the Gln repeat to more than 40 residues is associated with Kennedy's disease, an X-linked neurodegenerative disorder characterized by slowly progressing muscle weakness (52). Mhatre and co-workers (53) have shown that extension of the Gln repeat to 40 or 50 residues resulted in a decreased AR capacity to activate transcription.
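The relationship between the regions named in the text can be made explicit with simple interval arithmetic. The Python sketch below uses only residue boundaries stated in this paper (TAU-1, residues 1-485; TAU-1 core, residues 101-370; TAU-5, residues 360-528; and the three stretches listed above). The Region type and helper functions are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    start: int  # first residue, inclusive
    end: int    # last residue, inclusive


def overlap(a: Region, b: Region) -> Optional[Region]:
    """Return the shared residues of two regions, or None if disjoint."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Region(start, end) if start <= end else None


def contains(outer: Region, inner: Region) -> bool:
    """True if the inner region lies completely within the outer region."""
    return outer.start <= inner.start and inner.end <= outer.end


TAU1 = Region(1, 485)
TAU1_CORE = Region(101, 370)
TAU5 = Region(360, 528)
STRETCHES = {
    "Pro": Region(371, 378),
    "Ala": Region(397, 401),
    "Gly": Region(448, 463),
}

shared = overlap(TAU1, TAU5)
print("TAU-1/TAU-5 overlap:", shared)
for name, stretch in STRETCHES.items():
    in_tau5 = contains(TAU5, stretch)
    in_tau1_core = overlap(TAU1_CORE, stretch) is not None
    print(f"{name} stretch: in TAU-5={in_tau5}, in TAU-1 core={in_tau1_core}")
```

With these boundaries, all three stretches fall inside TAU-5 and outside the TAU-1 core, while the full TAU-1 and TAU-5 regions share residues 360-485.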
The capability of the AR to use different and unique
regions of its N-terminal domain as TAUs introduces the possibility
that different TAUs are responsible for the regulation of different
genes, resulting in cell-specific and AR-specific gene expression.
Evidence for the use of different N-terminal regions to activate
different kinds of promoters was provided by the analysis of a series
of AR-GAL4 chimeric constructs. These proteins contained either the
entire or part of the AR N-terminal domain linked to the GAL4 DNA
binding domain. When the transactivating capacity was tested on two
different promoters, a shift in position of the TAU responsible for
full activity was observed. The location of the TAU was more C-terminal
when tested on the (UAS)2tkCAT construct compared with the
(UAS)5E1bCAT reporter (Fig. 4). It is unlikely that
these two situations represent the use of two different and separate
TAUs; rather, they indicate a small difference in the location of
essential sequences. Since the AR4G and AR5 constructs only differ in
their DBD, it might be expected that the respective TAUs would be
located in the same part of the N-terminal domain. When AR4G
derivatives were tested on the (UAS)2tkCAT reporter and the
AR5 derivatives on the comparable (GRE)2tkCAT reporter, the
region between amino acids 360 and 528 was essential for the
transactivating capacity of both AR4G and AR5. However, for full
transactivating capacity, the AR4G needs a larger region (amino acids
188-485), which might indicate that the replacement of the DBD
influenced the size of the N-terminal region used as the TAU.
In contrast to the DNA binding and steroid binding domains, it is clear that the large and unique TAU-1 responsible for wild type AR transactivating capacity is not a sharply bordered functional domain. TAUs have been located and characterized in the N-terminal domains of the GR and PR. The N-terminal TAU of the human GR has been delineated to the central part of the GR N-terminal domain (amino acids 77-262). In contrast to our observations, the core unit of this TAU consists of only 40-60 amino acids (54), and the same region is responsible for the functioning of both the wild type GR and the constitutively active GR (55). The TAU responsible for the transactivating capacity of the human PR B form is located in the last 90 amino acids of the N-terminal domain, and its activity can be modulated by the first 164 N-terminal amino acids (56). These studies established that the location of the TAU in the various steroid receptors differs and that there is almost no evidence for sequence or structural homology. The sequences characterizing TAUs might represent interfaces that function by direct or indirect binding to general and/or specific transcription factors (2, 5, 33, 34, 35). However, the mechanism by which TAUs can regulate gene transcription is still largely unknown.