(Received for publication, July 27, 1995)
SATB1 is a nuclear matrix attachment DNA (MAR)-binding protein that is predominantly expressed in thymocytes. This protein binds to the minor groove, specifically recognizing an unusual DNA context exhibited by a specific MAR region with strong base-unpairing propensity. A phage library displaying nonamer random peptides without any built-in structure was used to identify a MAR binding motif of SATB1. One predominant cyclic peptide, C1 (CRQNWGLEGC), selected by a MAR-affinity column showed 50% identity with a segment in SATB1 (amino acids 355-363). Replacement of the C1 similarity segment in SATB1 by a random amino acid sequence or its truncation resulted in more than 80% reduction in MAR binding. In contrast, replacement of the same SATB1 segment with the C1 peptide restored MAR binding activity and specificity to the levels of the wild-type protein. Single amino acid mutation of the conserved Arg or Glu residue to Ala greatly reduced MAR binding. Taken together, our data show that a nine-amino acid sequence in SATB1 represents a key MAR binding motif. Phage display may provide a general tool for rapid identification of DNA binding peptide motifs.
Eukaryotic chromatin is organized into loop domains, which may
have both structural and functional roles such as differential gene
expression and replication (1, 2, 3). It is
believed that the chromatin loops are anchored to the nuclear matrix at
specific DNA regions that exhibit high affinity to the nuclear matrix.
These matrix-associated regions, or MARs (2, 3), are common at the boundaries of
transcription units (4, 5, 6, 7, 8) and often
found near enhancer-like regulatory
sequences (9, 10, 11, 12). Recent
evidence shows that MARs play a role in tissue-specific gene
expression. MARs associated with the immunoglobulin µ heavy chain
locus have been shown to be essential for transcription of a rearranged
µ gene in transgenic B lymphocytes (13). Also, MARs are
binding targets for a cell type-specific MAR-binding protein, SATB1.
SATB1 is primarily expressed in thymus (14), and because it is
expressed during specific stages of T-cell development, it
is believed to be involved in thymocyte maturation and differentiation.
SATB1 is a novel type of DNA-binding protein that recognizes a
specific sequence context in which one strand exclusively consists of
mixed A, T, and C nucleotides (ATC sequences). Clustered ATC sequences
commonly found in MARs have a strong tendency to unwind by extensive
base unpairing (15). The unwinding property of MARs confers
high affinity binding to the nuclear matrix and is essential for
transcription enhancing activity of such sequences (16). SATB1
does not bind to sequences that are similarly AT-rich but lack the
unwinding capability. SATB1 binding is highly specific, but exhibits an
unusual mode of DNA recognition; it interacts with the minor grooves of
its target sequences while making little contact with the
bases (14). A 150-amino acid segment of SATB1 was recently
identified as the MAR binding domain (17). The concomitant
presence of both NH2- and COOH-terminal arms of this segment
is necessary for full binding activity, but the DNA contact sites
remain to be determined.
We have taken an unconventional approach in delineating the DNA binding sites within SATB1 using a random peptide bacteriophage display library. Bacteriophage display is a powerful tool to study protein-target molecule interactions (18, 19). In this system, hundreds of millions of random peptides are expressed on the surface of bacteriophage as fusion protein libraries, and ligands for various purposes can be selected from them. Peptide ligands for proteins (20, 21, 22, 23, 24), antibodies (25), and enzymes (26) have been identified in this manner. We speculated that phage display libraries could also represent a vast source of DNA binding motifs. We report here that a random peptide bacteriophage library without any built-in structures can be used to affinity-select specific MAR binding peptides. Based on the sequence similarities between the selected peptide and a native MAR-binding protein SATB1, we were able to define a nine-amino acid segment in SATB1 as a key MAR binding motif for this protein. Deletion of this segment markedly reduced the MAR binding activity of SATB1. In the context of SATB1, this motif together with other components in SATB1 confers unique binding specificity to the AT-rich sequences with strong unpairing potential.
Figure 2:
Alignment of the C1 phage-derived peptide
to the NH2- and COOH-terminal arms of the MAR binding domain
of SATB1. The amino acid positions of the native SATB1 are shown. The
MAR binding domain of mouse SATB1 was defined as a 150-amino acid
region (amino acids 346-495) within the 764-amino acid coding
sequence (17). Sequence identities and similarities are
indicated by solid and broken rectangles,
respectively. Gaps are indicated by bars. Alignment between
the phage-derived peptides and SATB1 is shown with thick
lines. Alignment restricted between the NH2- and
COOH-terminal arms of the native SATB1 protein is shown with thin
lines. Since the two cysteines in the peptide displayed on the phage
are likely to play a structural role (see text), they are not included
in the alignment.
A library containing about 10 independent clones
was constructed by inserting CX9 peptides (C:
cysteine; X: any amino acid) into the pIII protein of the
fUSE5 vector (18, 19). The inclusion of the cysteine
was intended to facilitate the selection of peptides containing
cysteine pairs, should the binding activity studied require a cyclic
structure. Such peptides can potentially cyclize and improve the
affinities for target binding (28, 31). Phage
particles were incubated with MAR covalently coupled to Sepharose (27) under conditions developed for purifying DNA-binding
proteins (29, 30). The DNA consisted of concatemerized
repeats of a 25-base pair SATB1 recognition sequence
(5'-TCTTTAATTTCTAATATATTTAGAA-3'), which is derived from the core
unwinding element of the MAR downstream of the mouse immunoglobulin
heavy chain enhancer (14). After washing, the bound phage were
eluted with a KCl gradient in DNA binding buffer, amplified, and
reapplied to the column for further selection. Following five rounds of
such selection, the retention efficiencies on the MAR column, measured
as transducing units (TU) of eluted phage, were increased over 300
times compared with plain Sepharose or Sepharose coupled with
nonspecific herring sperm DNA. Concomitantly, with each successive
round of selection the elution profile shifted toward higher salt
concentrations (Fig. 1), reflecting the enrichment of phage with
high affinity. In the sixth and seventh rounds of panning, the KCl
concentration in the incubation buffer was raised from 0.1 to 0.4 M and 0.7 M, respectively; this led to a further shift in
the elution profile, with most of the phage eluting at 2 M KCl (Fig. 1).
Figure 1:
Shift in the phage elution profile following multiple rounds of selection on a
MAR affinity column. Results from the first, fourth, and seventh rounds of selection are shown. Phage from
a nonamer random peptide display library were bound to a MAR DNA
affinity column as described under "Experimental
Procedures" and eluted with buffers containing the indicated
final concentrations of KCl. The yield was estimated from the number of
tetracycline-resistant colonies or TU following infection of K91Kan
strain of E. coli cells; the y axis shows the yield
from each fraction expressed as the percent of total TU
recovered.
Table 1 lists the amino acid sequences displayed
on these phage and their MAR binding activity determined by a filter
binding assay. After seven rounds of selection, a single peptide,
CRQNWGLEGC (C1), predominated, accounting for over 60% (26/42) of all the phage
clones sequenced, and exhibited strong MAR binding activity. Sequence
alignment revealed a similarity of C1 with SATB1 at the
NH2-terminal (four identities and one similarity among 8
residues) and COOH-terminal arms (two identities and one similarity
among 8 residues) of the MAR binding domain (Fig. 2 and Ref. 17). Two other clones (C2 and C3) were similar to C1 and
SATB1 in that they both contained charged residues at the conserved
positions. Clone 4 (C4) was identified four times and showed some
similarities to the NH2-terminal arm of SATB1 adjacent to the C1
similarity region, but not to the COOH-terminal arm (not shown). The
remaining clones, including clone 5 (C5) which was repeated three
times, did not show discernible similarities to SATB1. Except for C1
and C4, all other clones showed low levels of binding but still above
those of control phage.
All the clones had a second cysteine, suggesting a preference for cyclized peptides. The conformational constraint provided by the disulfide bond appeared to be important for the binding, as reduction of the bond greatly decreased MAR binding by C1 (Fig. 3A) and by other clones (not shown) in the filter binding assay. To test the specificity of MAR binding by the predominant C1 clone, an unrelated DNA sequence was used as a probe, and no binding was detected (Fig. 3B). A 100-fold molar excess of unlabeled wild-type MAR completely inhibited C1 binding, whereas nonspecific herring sperm DNA had no effect even at much higher concentrations (Fig. 3C). However, C1 binding was also inhibited by unlabeled mutated MAR (5'-TCTTTAATTTCTACTGCTTTAGAA-3'), which has lost the base unpairing capability and is not bound by SATB1, even though it is still AT-rich (14). As described below, it appears that the context of the whole DNA binding domain is required for the C1 similarity region in SATB1 to distinguish between the wild-type and mutated MARs.
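The difference between the two probes can be made concrete with a small calculation. The sketch below uses the wild-type and mutated sequences exactly as printed in the text and computes their AT content, together with the longest stretch of the printed strand that contains no G (that is, a run made only of A, T, and C). The function names are hypothetical, and the G-free run is only an illustrative proxy for the ATC sequence context, assumed here for demonstration; it is not the measure of base unpairing used in the experiments.

```python
# Probe sequences copied from the text (printed strand, 5' to 3').
WILD_TYPE_MAR = "TCTTTAATTTCTAATATATTTAGAA"
MUTATED_MAR = "TCTTTAATTTCTACTGCTTTAGAA"


def at_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def longest_atc_run(seq: str) -> int:
    """Return the length of the longest stretch of seq that contains no G.

    Illustrative proxy only (an assumption of this sketch): a long G-free
    run on one strand corresponds to an uninterrupted ATC sequence.
    """
    return max(len(run) for run in seq.split("G"))


if __name__ == "__main__":
    for name, seq in (("wild-type", WILD_TYPE_MAR), ("mutated", MUTATED_MAR)):
        print(f"{name}: length={len(seq)}, "
              f"AT fraction={at_fraction(seq):.2f}, "
              f"longest ATC run={longest_atc_run(seq)}")
```

With the sequences as printed, both probes remain AT-rich (22 of 25 bases versus 18 of 24), but the mutation shortens the uninterrupted ATC stretch from 22 to 15 bases, in line with the statement that the mutated MAR is still AT-rich yet differs in sequence context.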
Figure 3:
Characterization of phage binding to SATB1
by a filter binding assay as described under "Experimental
Procedures." A, requirement for disulfide bonds for
efficient MAR binding. C1 phage were preincubated in either regular DNA
binding buffer (DBB) or in DBB containing 5 mM dithiothreitol
for 15 min, before the addition of 32P-labeled MAR. B, binding specificity of C1 phage. An unrelated DNA probe,
CRBP II thyroid hormone response element, was labeled to the same
specific activity and incubated with C1 phage under the same conditions
as the MAR. C, competition assays. Various concentrations of
unlabeled wild-type MAR (25), mutated MAR (24), or nonspecific DNA (sonicated herring
sperm DNA, average length about 150 base pairs) were added to the
filter binding reactions together with the 32P-labeled
wild-type MAR. Results from one of three typical experiments are
shown.
The importance of the native SATB1
sequence mimicked by the peptides for MAR binding was examined by
deletion mutagenesis. Partial truncation of the C1 region in SATB1 DNA
binding domain greatly reduced MAR binding activity (Fig. 4A, MD-361), but additional truncations,
MD-367 (17) and MD-369, which deleted most, or all,
of the C4 similarity region caused no further decrease in MAR binding
activity. It may be that the C4 similarity region is a minor
contributor to MAR binding in SATB1 and that the C4 peptide represents
a more potent version of that minor binding site. Similar enhancement
of weak binding sites by peptide mimics selected from libraries has
been seen with an integrin (28). Because of the apparent minor
role of the C4 region for SATB1 binding activity, this region was not
studied further.
Figure 4:
A, truncation of the C1 similarity region greatly reduces MAR
binding. A series of deletion mutants of the SATB1 MAR-binding domain
were generated by polymerase chain reaction and expressed in bacteria
as glutathione S-transferase fusion proteins. The C1 and C4
phage peptide similarity regions are indicated by a thick solid or a thin broken bar, respectively, above the diagram of
the wild-type construct. B, the residues conserved between the C1
peptide and SATB1 are required for MAR binding. Depicted is the
schematic representation of the native MAR binding domain of SATB1 (MD-wt) and the various constructs derived from it (see text
for details of construction). Purified proteins were subjected to
quantitative gel mobility shift assays as described
previously (14). Dissociation constants (Kd) were estimated from the amount of
protein required for a 50% shift of the probe under conditions of
protein excess and normalized to that of the wild-type construct. The
relative Kd values are shown on the right.
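The estimate described in this legend can be motivated by a simple one-site binding model. The relations below are a sketch under that assumption, which is not stated in the text: θ denotes the fraction of probe shifted, [P] the protein concentration (taken as approximately equal to the free protein concentration because protein is in excess), and [P]½ the protein concentration giving a 50% shift. These symbols are introduced here for illustration only.

```latex
\theta = \frac{[P]}{K_d + [P]}
\quad\Longrightarrow\quad
\theta = \tfrac{1}{2} \;\text{ when }\; [P] = [P]_{1/2} = K_d ,
\qquad
K_d^{\mathrm{rel}} = \frac{[P]_{1/2}^{\mathrm{construct}}}{[P]_{1/2}^{\mathrm{wild\ type}}}
```

Under this model, a construct that needs more protein than the wild type to shift half of the probe has a relative dissociation constant greater than 1, corresponding to weaker MAR binding.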
To test the relevance of the similarity between the
C1 peptide and the two arms of the SATB1 DNA binding domain, we
replaced the native sequence (amino acids 355-362) with the C1
peptide and its variations (Fig. 4B). The resultant
chimeric SATB1 MAR binding domains were expressed as glutathione S-transferase-fusion proteins. The C1 peptide chimera (MD-C1)
yielded a gel shift pattern virtually identical to that of the
wild-type DNA binding domain (MD-wt) (Fig. 5) and exhibited
similar affinity to the wild-type MAR probe (Fig. 4B).
In contrast, a deletion mutant lacking amino-terminal 21 residues
(MD-367) and a protein produced by inserting a random peptide
(MD-XX) had greatly reduced MAR binding activities (Figs. 4 and 5). Furthermore, like MD-wt, the MD-C1 chimera discriminated
between the wild-type and mutated probes, since a 200-fold molar excess
of unlabeled wild-type MAR completely inhibited binding, whereas the
mutated MAR had virtually no effect (Fig. 5). Thus, MD-C1
confers specific binding to the AT-rich sequence with high unwinding
capability.
Figure 5:
The C1 peptide can functionally replace
the similar region in the SATB1 MAR binding domain. Schematic
representation of the constructs used is shown in Fig. 4. Gel
mobility shift assays were carried out by incubating identical
concentrations (16 nM) of purified bacterial SATB1 fusion
proteins (Fig. 4) with 32P-labeled MAR (25). For competition assays, unlabeled
wild-type MAR (25) (wt) or mutant MAR (24) (mut) were used at 200-fold molar
excess. Gel mobility shift assays were carried out according to the
legend to Fig. 4.
The amino acids in the C1 peptide critical for MAR
binding were studied by point mutations. Simultaneous mutations of the
three conserved residues Arg, Asn, and Glu to Ala in MD-C1 (mut
RNE-AAA) reduced the affinity of the resultant chimera for MAR to the
level of the deletion mutant MD-367 (Fig. 4, A and B). Mutation of either Arg (mut R-A) or Glu (mut E-A) alone
was sufficient to reduce binding to the same level, whereas a mutation
in Asn (mut N-A) had an intermediate effect. These results show that
specific amino acids, Arg and Glu, that are conserved among C1, the
NH2-terminal arm, and the COOH-terminal arm of the SATB1 MAR
binding domain are critical for MAR recognition. Thus, consistent with
the predominance of C1 phage after stringent selection, results from
peptide swapping, point mutations, and deletion mutations all suggest
that the C1 similarity region in SATB1 is a key DNA binding site.
Using a random peptide phage library without any built-in
structure, we have isolated a predominant peptide (C1) of eight amino
acids which is similar to the sequences found in SATB1 located within
the NH2- and COOH-terminal regions of the 150-amino acid MAR
binding domain. Several lines of evidence suggest that this peptide
represents a MAR binding motif. First, phage displaying this peptide
can directly bind both to the MAR immobilized on an affinity column and
to the radiolabeled MAR in solution. More importantly, it could
functionally replace the similar native sequence in a chimeric SATB1
construct without loss of binding affinity or specificity. In contrast,
either deletion of this SATB1 sequence or its replacement with a random
sequence markedly reduced the MAR binding activity of the mutated
SATB1. Finally, when amino acids conserved between phage-derived
peptide and SATB1 were singly mutated to Ala, the MAR binding was
greatly reduced. Because both the NH2- and COOH-terminal
arms of the MAR binding domain are required for MAR binding, one of the
models proposed earlier is that the two arms may converge at the
surface of SATB1 to form a symmetric binding structure capable of
grasping DNA (17). Consistent with this model, we have
demonstrated here that a subsequence of no more than 9 amino acids
within these arms can directly interact with DNA, as the predominant
peptide selected for MAR binding is similar to both arms of the SATB1
DNA binding domain and can functionally replace the similarity region.
Thus, SATB1 may be similar to TATA-binding protein in that the critical
DNA binding activities reside in two similar portions of the same
molecule (32, 33, 34).
SATB1 is the first
non-ubiquitous MAR-binding protein cloned to date which is
predominantly expressed in a specific tissue, thymus. The observation
that SATB1 is expressed in thymocytes during specific stages of T-cell
development suggests that this protein may play a role in T-cell
maturation. In collaboration with M. Sikorska, we have
found that SATB1 may be involved in thymocyte apoptosis, for it is
proteolytically cleaved very early after the apoptotic induction of
thymocytes. The nuclear matrix protein composition varies
depending on cell types (35), suggesting the existence of other
cell type-specific MAR-binding proteins like SATB1. The DNA binding
sequence of SATB1 may facilitate the identification of MAR binding
potential in other proteins and may define a new class of DNA binding
motifs.
Our results demonstrate that small peptides displayed on the surface of filamentous phage can have high affinity for specific DNA sequences and can allow identification of a specific DNA binding motif within such proteins. Recently, three zinc fingers of the Zif268 protein were expressed on the phage surface, and randomization of the DNA contacting residues was used to select for zinc fingers with new DNA binding specificities (36, 37, 38). Rather than using long peptides with built-in structures, we have demonstrated that a random short-peptide phage library can be effective in searching for potential DNA binding sites in DNA-binding proteins for which the DNA binding domains are unknown or have not been completely characterized. Structure/function studies of DNA binding peptides identified from peptide libraries may present new opportunities for drug discovery.