An Atypical Homeodomain in SATB1 Promotes Specific Recognition of the Key Structural Element in a Matrix Attachment Region*

(Received for publication, October 11, 1996, and in revised form, January 29, 1997)

Liliane A. Dickinson‡, Craig D. Dickinson§, and Terumi Kohwi-Shigematsu

From the Burnham Institute, La Jolla Cancer Research Center, La Jolla, California 92037


ABSTRACT

SATB1 is a cell type-specific nuclear matrix attachment region (MAR) DNA-binding protein, predominantly expressed in thymocytes. We identified an atypical homeodomain and two Cut-like repeats in SATB1, in addition to the known MAR-binding domain. The isolated MAR-binding domain recognizes a certain DNA sequence context within MARs that is highly potentiated for base unpairing. Unlike the MAR-binding domain, the homeodomain when isolated binds poorly and with low specificity to DNA. However, the combined action of the MAR-binding domain and the homeodomain allows SATB1 to specifically recognize the core unwinding element within the base-unpairing region. The core unwinding element is critical for MAR structure, since point mutations within this core abolish the unwinding propensity of the MAR. The contribution of the homeodomain is abolished by alanine substitutions of arginine 3 and arginine 5 in the N-terminal arm of the homeodomain. Site-directed mutagenesis of the core unwinding element in the 3' MAR of the immunoglobulin heavy chain gene enhancer revealed the sequence 5'-(C/A)TAATA-3' to be essential for the increase in affinity mediated by the homeodomain. SATB1 may regulate T-cell development and function at the level of higher order chromatin structure through the critical DNA structural elements within MARs.


INTRODUCTION

Eukaryotic chromosomes are thought to be separated into topologically independent loop domains by periodic attachment onto an intranuclear frame known as the nuclear matrix or skeleton, defined as the insoluble material left in the nucleus after a series of biochemical extraction steps (1). Specific DNA sequences that bind to the nuclear matrix in vitro are called matrix attachment regions (MARs),1 and these sequences have been postulated to form the base of chromosomal loops (reviewed in Refs. 2 and 3). MARs may be important to organize chromosomes and regulate DNA transcription and replication within the nucleus. In support of this notion, MARs often colocalize or are located in close proximity to regulatory sequences including enhancers (4-9), and some MARs can augment transcription from heterologous promoters in stable transformants (5-7, 10, 11). Recent evidence shows that MARs play a role in tissue-specific gene expression. The MARs associated with the immunoglobulin µ heavy chain locus are essential for transcription of a rearranged µ gene in transgenic B lymphocytes (12). Identification of the cell type-specific MAR-binding protein SATB1, which is predominantly expressed in thymocytes, shows that MARs can be specific targets for a cell type-specific factor (13).

SATB1 defines a novel class of DNA-binding proteins that recognize a specific sequence context that exhibits a high base unpairing or unwinding propensity. MARs are generally AT-rich and typically contain a subregion(s) that exhibits a strong potential to base-unpair under negative superhelical strain (10, 14). A high AT content, however, is not sufficient to confer high affinity binding to SATB1; specific mutations within MARs, which maintain the AT-richness but eradicate the unwinding capability, substantially reduce or abolish SATB1 binding (13). Analysis of SATB1 binding sites in MARs revealed that binding is restricted to the subregion of MARs that has a high unwinding propensity. This base-unpairing region consists of a cluster of sequence stretches with a special AT-rich DNA sequence context, in which Cs are sequestered exclusively on one strand and Gs on the other (ATC sequences) (13). A short core unwinding element can be present within one of these ATC sequences, which can be detected by virtue of its most persistent base unpairing even under conditions that favor the double-stranded DNA configuration; mutation of this element abolishes the base-unpairing propensity of MARs (14). The unpairing potential was demonstrated to be essential for MAR function; a concatemer, wild-type (25)7, of the core unwinding element of the 3' MAR of the immunoglobulin heavy chain (IgH) enhancer displays high binding affinity to the nuclear matrix, unwinds under superhelical strain, and enhances transcription from a linked reporter gene. A corresponding mutated version, mutated (24)8, has lost all of these properties (10).

To date, three proteins with similar binding specificity have been identified in addition to SATB1: nucleolin, a major nucleolar protein with multiple functions (15), p114, isolated from breast carcinoma (16), and Bright, a protein that is predominantly expressed in B-cells (17). These proteins bind with high affinity to MARs, and we showed that nucleolin and p114 can distinguish wild-type (25)5 from mutated (24)8. Unlike other proteins known to bind MARs such as lamin B1 (18) and topoisomerase II (19), SATB1 binds MARs with very high affinity, exhibiting dissociation constants (Kd) in the range of 10⁻⁹ to 10⁻¹⁰ M, comparable to many sequence-specific transcription factors.

To understand the biological role of SATB1, it is important to delineate the functional domains in this protein. A minimum 150-amino acid MAR-binding domain that contains novel DNA-binding motifs was previously identified (20). We report here that SATB1 contains an additional domain that shares homology with known homeodomains. Homeodomains are 60-amino acid DNA-binding domains, and their amino acid sequence is highly conserved, as well as their three-dimensional structure. Homeodomain proteins function in vitro and in vivo as sequence-specific transcription factors, and they are important developmental regulators that determine position or cell-type specificity (reviewed in Refs. 21 and 22). Unlike known homeodomains that directly and independently bind DNA, the homeodomain in SATB1 does not bind to the MAR probes analyzed here nor does it bind to a dimerized sequence (RP2) that resembles the homeodomain consensus sequence (23). When associated with the MAR-binding domain, however, the SATB1 homeodomain enhances binding specificity toward the core unwinding element of a MAR.


EXPERIMENTAL PROCEDURES

Protein Domain Analysis

We performed searches of the SWISS-PROT data base (release 26.0, August, 1993) using the programs Blast (24) and Blitz (25). Computations were performed using the Blast server at NCBI and the Blitz server at EMBL. Best results were obtained with Blitz searches using the PAM 120 matrix and a gap penalty of 13. To lower the background of nonsignificant matches, it was necessary to remove a segment rich in glutamines and prolines from the query sequence (residues 593-619 of SATB1). The MAR-binding domain was previously delineated by successive deletion mapping combined with gel mobility shift analysis, and the repeated regions (boxes I and II) were detected by computer-aided sequence comparisons (20).

Protein Expression

Plasmids for the fusion protein expression were constructed as follows. The desired SATB1 fragments were amplified from the human cDNA clone pAT1146 (13) by the polymerase chain reaction using Taq DNA polymerase and the appropriate primers containing a BamHI or EcoRI site. The fragments were isolated from agarose gels, purified by Elutip D columns (Schleicher & Schuell), and cloned in frame in the BamHI or BamHI/EcoRI site of the vector pGEX2T (Pharmacia Biotech Inc.). Deletion of the homeodomain was achieved by first synthesizing a fragment ranging from the N-terminal residue of the MAR domain (position 346) to the N-terminal residue of the homeodomain (position 641) and cloning it into BamHI/EcoRI-digested pGEX2T. In a second step, a fragment ranging from the C-terminal residue of the homeodomain (position 702) to the end of the cDNA was amplified and ligated in frame in the EcoRI site downstream of the insert made in the first step. Glutathione S-transferase (GST)-fusion proteins were overexpressed in Escherichia coli (XL1 Blue) and purified on glutathione-Sepharose according to standard procedures (26). Protein concentrations were determined using a Bradford protein assay kit (Bio-Rad), which was followed by quantitation of the fusion proteins by Coomassie Blue staining of SDS-polyacrylamide gels. To obtain precise comparisons, the different fusion proteins were run side by side on the same gel, and the band intensities were compared by laser densitometer scanning.

DNA-binding Assays

Gel mobility shift assays were carried out as described (13), with no poly(dI-dC)·poly(dI-dC) added or with 0.5 µg/20 µl. The 3' MAR is identical to the IgH 3'-En fragment described previously (13). The wild-type 3' MAR and the mutated fragments were subcloned in the EcoRI site of Bluescript (Stratagene), and the fragments were isolated by EcoRI restriction enzyme digestion and purification from an agarose gel. Pentamer repeats of binding sites V and VI were made exactly as described for wt(25)5 (13), using the following oligonucleotides: 5'-CTTAAAATTACTCTATTATTCGAAttc-3' with its complementary strand 5'-TTCGAATAATAGAGTAATTTTAAGgaa-3' for wt(V)5, and 5'-TTCCCTCTGATTATTGGTCTCCATGAAttc-3' with 5'-TTCATGGAGACCAATAATCAGAGGGAAgaa-3' for wt(VI)5. The lowercase letters indicate single-stranded overhangs used for end to end ligation of the double-stranded oligonucleotides.
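
The oligonucleotide pairs listed above can be checked for complementarity with a short script. The following is a minimal sketch, not part of the original procedure; the helper names (reverse_complement, split_oligo, pair_is_consistent) are hypothetical, and the only inputs taken from the text are the four sequences. It assumes that uppercase letters form the duplex and lowercase letters form the 3' overhang.

```python
from typing import Tuple

# Illustrative check only: helper names are hypothetical; sequences are copied from the text.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def split_oligo(oligo: str) -> Tuple[str, str]:
    """Split an oligo into its duplex part (uppercase) and its overhang (lowercase)."""
    duplex = "".join(base for base in oligo if base.isupper())
    overhang = "".join(base for base in oligo if base.islower())
    return duplex, overhang


def pair_is_consistent(top: str, bottom: str) -> bool:
    """True if the duplex parts are reverse complements and the overhangs can anneal."""
    top_duplex, top_overhang = split_oligo(top)
    bottom_duplex, bottom_overhang = split_oligo(bottom)
    duplex_ok = reverse_complement(top_duplex) == bottom_duplex
    overhang_ok = reverse_complement(top_overhang) == bottom_overhang
    return duplex_ok and overhang_ok


OLIGO_PAIRS = {
    "wt(V)5": ("CTTAAAATTACTCTATTATTCGAAttc", "TTCGAATAATAGAGTAATTTTAAGgaa"),
    "wt(VI)5": ("TTCCCTCTGATTATTGGTCTCCATGAAttc", "TTCATGGAGACCAATAATCAGAGGGAAgaa"),
}

if __name__ == "__main__":
    for name, (top, bottom) in OLIGO_PAIRS.items():
        print(name, "consistent:", pair_is_consistent(top, bottom))
```

Because the two overhangs of each duplex are complementary to each other, annealed units can ligate end to end in a head-to-tail orientation, which is what the concatemerization described above relies on.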

Probes for gel mobility shift analysis were prepared by labeling isolated restriction fragments at both ends using Klenow polymerase and [32P]dATP. Under conditions of protein excess, the concentration required for half-maximal binding may be considered an estimate of the equilibrium binding coefficient (27). Autoradiographs of the gel mobility shift experiments were scanned by laser densitometry, and the percentage of free probe remaining was plotted against the protein concentration in nM.
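
The half-maximal binding estimate described above can be illustrated as follows. This is a sketch under stated assumptions: the titration values are hypothetical (they are not data from this study), the function name is invented for illustration, and linear interpolation between neighboring lanes is one simple way to locate the 50% point, not necessarily the procedure used here.

```python
def half_maximal_concentration(concentrations_nm, percent_free):
    """Interpolate the protein concentration (nM) at which 50% of the probe is free.

    Assumes concentrations increase and percent_free decreases along the titration.
    """
    points = list(zip(concentrations_nm, percent_free))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 50.0 >= f_hi and f_lo != f_hi:
            # Fraction of the way from the lower to the higher concentration
            fraction = (f_lo - 50.0) / (f_lo - f_hi)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("titration does not cross 50% free probe")


# Hypothetical densitometry readings (percent free probe per lane)
protein_nm = [0.0, 1.0, 2.0, 4.0, 8.0]
free_percent = [100.0, 80.0, 60.0, 40.0, 20.0]

if __name__ == "__main__":
    kd_estimate = half_maximal_concentration(protein_nm, free_percent)
    print("Estimated Kd (nM):", kd_estimate)
```

The estimate is only meaningful under the condition stated in the text, namely that protein is in excess over the labeled probe.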

DNA titration experiments were performed as described (28) with some modifications. The concentration of the DNA fragment to be labeled was determined using a TKO 100 minifluorometer (Hoefer Scientific Instruments), followed by agarose gel electrophoresis and ethidium bromide staining using a plasmid of known concentration as a standard. The concentration of protein that gave rise to a 40-70% shift at the lowest DNA concentration was determined empirically. All the DNA titrations were done in the presence of 0.5 µg/20 µl of poly(dI-dC). The binding reaction was incubated for 30 min at room temperature to ensure that equilibrium was reached. After electrophoresis the gels were dried and analyzed by a PhosphorImager (Bio-Rad).

Site-directed Mutagenesis

The single point mutations mut 2 to mut 7, and mut IV of the 3' MAR were previously described (14). Mut V, mut VI, and mut 8 were made by a PCR-based approach using four primer sets (29). Briefly, complementary oligonucleotides containing the desired mutations were synthesized, and they were used separately as primers in two PCR reactions with either KS or SK primer from the pBluescript polylinker region flanking the 300-bp 3' MAR. The two PCR products, one containing the desired mutation at its 5'-end and the other at its 3'-end, were mixed at an equimolar ratio, annealed, and amplified by PCR with both KS and SK primers. The amplified fragments containing the mutation were purified with a Wizard PCR preps system (Promega), digested with EcoRI, and subcloned in the EcoRI site of the vector Bluescript. Mut 8 was confirmed by Sanger sequencing. Mut V and mut VI were confirmed by the presence of an XhoI site in mut V or an SpeI site in mut VI, which were introduced by the multiple point mutations (see Fig. 3A). Alanine substitutions were introduced in the homeodomain following the Exsite PCR-based mutagenesis protocol (Stratagene), with TaqPlus and Pfu polymerase (both from Stratagene) and the pGEX2T plasmid containing the (MD + HD)-encoding insert as template. The mutations were designed to introduce a novel restriction site and were confirmed by restriction enzyme digestion and protein expression.


Fig. 3. The SATB1 homeodomain specifically increases binding affinity to the core unwinding element among three SATB1 direct contact sites in the 3' MAR. A, schematic representation of the IgH enhancer flanked by two MARs. The SATB1 binding sites are indicated by black bars and roman numerals. The ATC sequence cluster in the IgH 3' MAR is shown. Each ATC sequence is indicated by a bracket, and the SATB1 direct contact sites are shown by double-headed arrows and roman numerals. Residues that constitute the core unwinding element within site IV are indicated by filled dots. The sequences of the mutated binding sites are shown below, with an asterisk to mark the mutated residues. B, gel mobility shift experiments and binding curves comparing the affinities of (MD + HD) and (MDΔHD) to wild-type 5' MAR, wild-type 3' MAR, and mut IV.



RESULTS

SATB1 Contains a Homeodomain and Cut-like Repeats

In addition to the MAR-binding domain (residues 346-495) previously reported (20), computer-aided homology searches of the Swiss-Prot data base (30) identified a homeodomain homology at the C terminus of SATB1 (residues 641-702) (Fig. 1A). Many of the residues that are most conserved among homeodomains are also found in the SATB1 homeodomain, which shares 33% identity with the engrailed class of homeodomains (reviewed in Ref. 31, Fig. 1B). Identities are found with residues that in the x-ray structure of other homeodomains contribute to the hydrophobic core and residues that directly interact with DNA (32). This putative homeodomain is, however, divergent. Major differences include a single amino acid insertion at the end of the first helix and a substitution of the highly conserved WFQ motif in the third helix of known homeodomains by FFQ in both human and mouse SATB1.


Fig. 1. SATB1 contains a homeodomain and Cut-like repeats in addition to its MAR-binding domain. A, schematic representation of the overall structure of SATB1, indicating the positions of the MAR-binding domain including the amino acids at each end that are essential for MAR binding (shown in black), the homeodomain, the two Cut-like repeats, and the previously identified repeats box I and box II. B, alignment of the homeodomain in SATB1 with representative members of the different classes of homeodomain-containing proteins, defined by Scott et al. (31) (single letter amino acid code). Identical amino acids between SATB1 and other homeodomains are shown in closed boxes; open boxes indicate similar amino acids or residues in SATB1 that are identical to only one or two other members. A consensus sequence derived from the alignment is given at the bottom. The positions of the three helical regions are indicated. Dots represent residues important for structure; diamonds indicate amino acids contacting DNA, as derived from the crystal structure of the engrailed homeodomain-DNA complex (32). C, alignment of the Cut-like repeats A and B in SATB1 with the Cut proteins from Drosophila melanogaster (CUT I-III) and the mammalian Clox proteins (CLOX I-III). Box I and box II in SATB1 are underlined. Identical and similar amino acids are shown in closed or open boxes, respectively. Amino acids indicated by dots are identical or conserved in both SATB1 repeats.


In addition to the homeodomain homology, a set of two repeats was found near the center of SATB1 (residues 370-445 and 493-568), similar to the Cut repeats of the Cut- and Clox-homeo-proteins of Drosophila and mammals, respectively. Cut proteins contain a homeodomain and three additional DNA-binding domains of 73 amino acids, called Cut repeats (33-36). The two Cut-like repeats in SATB1 (named here A and B) contain the previously documented repeats box I (residues 382-415 and 505-538) and box II (429-445 and 552-568), respectively (20) (Fig. 1, A and C). Repeat A occurs at the center of the MAR-binding domain of SATB1, but it does not include the N- and C-terminal amino acids that are mandatory for MAR binding (20). The two repeats of SATB1 are 45% identical over 75 residues with each other and display 27-35% identity with the Cut repeats. This similarity is considered to be significant, since no gaps were required for optimal alignment (Fig. 1C).

The Homeodomain Increases Binding Affinity of SATB1 to a MAR

Most homeodomain proteins contain a homeodomain as the sole DNA-binding domain. A group of homeodomain proteins have additional domains that assist the homeodomain in DNA binding specificity (reviewed in Refs. 37 and 22). In the case of SATB1, the MAR-binding domain by itself is sufficient to recognize and bind a specific region (base-unpairing region) within MARs that has a high propensity for base unpairing, and the homeodomain may have a new role in DNA recognition. To explore this possibility, glutathione S-transferase (GST)-SATB1 fusion proteins were constructed; one protein contained the MAR domain and homeodomain linked together in their natural protein context (GST(MD + HD)); one protein had the 60-amino acid homeodomain specifically deleted (GST(MDΔHD)), and one fusion protein contained the homeodomain separately (GST(HD)) (Fig. 2A). These purified fusion proteins were used in quantitative gel mobility shift experiments with a fixed concentration of a synthetic MAR probe, wild-type (25)5, and increasing protein concentrations. This probe was derived from the core unwinding element of the MAR located 3' of the IgH enhancer, and it has the same properties as a natural MAR (10). Fig. 2B shows the gel mobility shift experiments and the binding curves that were derived from these autoradiographs. Each of these and the following gel shift experiments were repeated at least three times giving similar results. The isolated HD showed virtually no binding activity for the wt(25)5 probe; however, when HD was associated with MD (MD + HD), the binding affinity was approximately 10 times higher (Kd = 0.1 nM) than for MDΔHD (Kd = 1.0 nM). The affinity of MDΔHD toward wt(25)5 was virtually identical to that of the isolated MD alone (GST(MD)), indicating that the C-terminal amino acids from 496 to 763 besides HD have no additional contribution toward binding to wt(25)5 (data not shown).
HD weakly bound to longer MAR fragments, but this activity was mainly nonspecific, since it could be competed by nonspecific competitors (data not shown). This effect of the homeodomain on binding affinity was confirmed by additional DNA titration experiments, in which the dissociation constants were determined using a fixed protein concentration and increasing concentrations of the DNA probe (Fig. 2C). Dissociation constants determined in this manner are independent of minor variations in the protein concentration determination or the amount of active protein in the protein preparation. The results obtained from gel mobility shift experiments were quantitated using a PhosphorImager, and the Kd values were calculated from the least squares fit of a Scatchard plot of bound/free DNA as a function of bound DNA. The approximate Kd values were estimated from the negative reciprocal of the slope, and the Kd for MD + HD (0.06 nM) was approximately 7 times lower than for MDΔHD (0.4 nM) (Fig. 2C). The dissociation constants determined by protein titration or DNA titration were similar, indicating that nearly all the protein in the protein sample was in an active form.
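
The Scatchard calculation described above can be sketched in a few lines. For single-site binding, bound/free = (Bmax - bound)/Kd, so a plot of bound/free against bound has slope -1/Kd. The example below is illustrative only: the function names are hypothetical, and the data are simulated from an assumed Kd of 0.4 nM and an assumed Bmax of 0.1 nM; they are not the measurements shown in Fig. 2C.

```python
def linear_least_squares(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def scatchard_kd(bound_nm, free_nm):
    """Return (Kd, Bmax) from a Scatchard plot of bound/free versus bound."""
    ratios = [b / f for b, f in zip(bound_nm, free_nm)]
    slope, intercept = linear_least_squares(bound_nm, ratios)
    kd = -1.0 / slope          # negative reciprocal of the slope
    bmax = intercept * kd      # x-intercept of the fitted line
    return kd, bmax


# Simulated single-site binding data (assumed values, for illustration only)
TRUE_KD_NM = 0.4
TRUE_BMAX_NM = 0.1
free_dna_nm = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
bound_dna_nm = [TRUE_BMAX_NM * f / (TRUE_KD_NM + f) for f in free_dna_nm]

if __name__ == "__main__":
    kd, bmax = scatchard_kd(bound_dna_nm, free_dna_nm)
    print("Kd (nM):", round(kd, 3), "Bmax (nM):", round(bmax, 3))
```

With real gel data the points scatter around the line, and the quality of the fit, rather than a single slope value, determines how much weight the resulting Kd can bear.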


Fig. 2. The homeodomain increases binding affinity of SATB1 to the wild-type (25)5 MAR, but by itself it binds poorly to this sequence. A, GST-fusion protein constructs used in gel mobility shift assays. GST(MD + HD) contains both MAR domain and homeodomain in their natural protein context; GST(MDΔHD) is identical to the above except for specific deletion of the 60-residue homeodomain. GST(HD) contains the homeodomain. The wavy line indicates the GST moiety of the fusion proteins. Q indicates the polyglutamine stretch found at this position. B, protein titration experiments. Gel mobility shift experiments and binding curves comparing the binding activities of GST-fusion proteins (shown in A) to the wt(25)5 MAR. Protein concentrations (in nM) are indicated at the top of the lanes. (Note that the concentration range differs with each construct.) The results were quantitated by laser densitometer scanning, and the amount of free DNA was plotted against the protein concentration. The Kd value was estimated as the concentration of protein that results in a 50% shift. The experiments were repeated at least three times and gave similar results. C, DNA titration experiments with MD + HD and MDΔHD and the wt(25)5 probe. The DNA probe concentration is indicated in nM at the top of the lanes. The ratio bound/free is plotted versus the concentration of DNA bound for each complex. All the binding reactions contained 0.5 µg/20 µl poly(dI-dC) as a competitor.


The SATB1 Homeodomain Promotes Binding of the MAR Domain to the Core Unwinding Element of the IgH 3' MAR

SATB1 binds a variety of MARs from different species and selectively recognizes sites within MARs that are prone to become stably base-unpaired under negative superhelical strain (13).2 The structural properties and the SATB1 binding sites of the 5' MAR and the 3' MAR, which flank the IgH enhancer, were previously characterized (13, 14) (Fig. 3A). These natural MARs were used as probes in quantitative gel mobility shift experiments to determine whether HD can increase binding affinity to MARs in general. When the 5' MAR fragment was used as probe, HD had no effect on the binding affinity; both MD + HD and MDΔHD exhibited nearly equal affinity to the 5' MAR (Kd = 7 and 10 nM, respectively) (Fig. 3B). In the case of the 3' MAR, however, the association of HD with MD increased the affinity by 6-fold compared with MD alone; the dissociation constants (Kd) for MD + HD and MDΔHD were 2.5 and 15 nM, respectively (Fig. 3B). This differential effect of the homeodomain could be due to the different structural properties that distinguish these two MARs. Both MARs contain a base-unpairing region, but only the IgH 3' MAR has a core unwinding element. The unwinding propensity is much greater for the 3' MAR than the 5' MAR; in a supercoiled plasmid, significant unwinding can be detected in the 5' MAR only when the 3' MAR is deleted (14). The core unwinding element is defined as a short discrete site that resists base pairing even under conditions that greatly favor a double-stranded configuration, and mutation of these sites results in a complete loss of the unwinding propensity of the MAR. Previous missing nucleoside experiments (13) showed that SATB1 directly contacts three sites in the 5' MAR (sites I, II, and III) when the isolated 5' MAR was used as a substrate. Using the 3' MAR as a substrate, three adjacent ATC sequence stretches (sites IV, V, and VI) were detected as the SATB1 contact sites (Fig. 3A).
Binding site IV overlaps with the core unwinding element and is the major binding site, since SATB1 makes contacts with sites V and VI only when site IV is mutated and is no longer bound (13).

To determine whether the homeodomain in SATB1 contributes to this preferential recognition of site IV, we used mutated MAR fragments as probes in gel mobility shift experiments with GST(MD + HD) and GST(MDΔHD). Each of these mutated MARs had one of the three sites destroyed by mutation and two sites intact (Fig. 3A). The affinity of the MAR-binding domain alone (MDΔHD) to each of the three mutated fragments was nearly the same as to wild-type 3' MAR with estimated Kd values of 15, 20, 22, and 12 nM for 3' MAR, mut IV, mut V, and mut VI, respectively (Fig. 3B, only the results for wild-type 5'- and 3' MARs and mut IV are shown). This result indicates that the MAR domain, in the absence of the homeodomain, cannot effectively distinguish among the three sites in the ATC sequence cluster. Regardless of which site was mutated, binding by (MDΔHD) was unaffected. On the other hand, the protein containing the homeodomain together with the MAR-binding domain (MD + HD) exhibited a significantly reduced binding affinity to mut IV (Kd = 14 nM) compared with wild-type 3' MAR (Kd = 2.5 nM) (Fig. 3B). No significant decrease in binding affinity was detected for MD + HD to mut V or mut VI compared with wild type, as long as site IV remained intact (data not shown). These results also show that the HD-mediated increase in affinity to the 3' MAR does not merely reflect a cooperativity of binding, caused by the presence of adjacent binding sites. If this were the case, any one of the three mutations would be expected to abolish the effect of the homeodomain and not just mutation of site IV. In fact, binding of (MD + HD) is virtually noncooperative, since a Hill coefficient of 1.2 was determined. On the contrary, the weaker binding of (MDΔHD) to 3' MAR appears to be cooperative, with a Hill coefficient of 2.2 (data not shown).
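
The text does not state how the Hill coefficients were obtained. One common approach, sketched here as an assumption, is a Hill plot: log(θ/(1 - θ)) is plotted against log[protein], where θ is the fraction of probe bound, and the slope of the fitted line is the Hill coefficient. The function names and the simulated titration below (generated with an assumed coefficient of 2.2 and an assumed half-saturation concentration of 15 nM) are hypothetical.

```python
import math


def hill_fraction_bound(protein_nm, k_half_nm, n):
    """Fraction of probe bound according to the Hill equation."""
    return protein_nm ** n / (k_half_nm ** n + protein_nm ** n)


def hill_coefficient(protein_nm, fraction_bound):
    """Slope of log10(theta / (1 - theta)) versus log10([protein])."""
    xs = [math.log10(p) for p in protein_nm]
    ys = [math.log10(t / (1.0 - t)) for t in fraction_bound]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Simulated titration (assumed parameters, for illustration only)
protein_nm = [5.0, 10.0, 15.0, 20.0, 40.0]
theta = [hill_fraction_bound(p, 15.0, 2.2) for p in protein_nm]

if __name__ == "__main__":
    print("Hill coefficient:", round(hill_coefficient(protein_nm, theta), 2))
```

A slope near 1 indicates independent binding, whereas a slope well above 1 indicates positive cooperativity, which is the distinction drawn above between (MD + HD) and (MDΔHD).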

The contribution of the homeodomain in directing SATB1 to the core unwinding element was further confirmed by employing concatemers of each site with short surrounding sequences as probes in gel mobility shift experiments (data not shown). The concatemer wt(IV)5 is identical to the previously described synthetic MAR wild type (25)5 (10). If the homeodomain does assist the MAR-binding domain to preferentially recognize the core unwinding element, it should specifically increase affinity to wt(IV)5 but not to wt(V)5 or wt(VI)5. Indeed, the increase in binding affinity observed with (MD + HD) compared with (MDΔHD) was 10-fold for wt(IV)5 but less than 2-fold for wt(V)5 and wt(VI)5 (data not shown). Thus, the homeodomain of SATB1 contributes to binding specificity by selectively increasing the affinity to site IV that contains the wild-type core unwinding element. It should be noted that, although the MAR-binding domain alone cannot distinguish among the three sites in the natural context of the 3' MAR, when each binding site was concatemerized and used separately as probe, (MDΔHD) showed moderate preference for site IV over site V and site VI. This preference for site IV, however, was much more pronounced when MD was associated with HD.

These results strongly indicate that in the context of the 3' MAR fragment, the MAR-binding domain of SATB1 is sufficient for the ATC sequence context recognition, because it can bind to any one of the three sites in the ATC sequence cluster of the IgH 3' MAR with comparable affinities. The specific recognition of the core unwinding element within the 300-bp MAR fragment, however, requires the association of the MAR-binding domain with the homeodomain. The homeodomain appears to direct SATB1 toward a preferential recognition of the core unwinding element in a cluster of ATC sequences, as illustrated in Fig. 5.


Fig. 5. The homeodomain acts in conjunction with the MAR-binding domain in SATB1 to confer a higher level of specificity to binding site recognition. SATB1 protein with the MAR-binding domain and the homeodomain is schematically represented at the top. The isolated MAR-binding domain is shown to bind to the base-unpairing region of the IgH 3' MAR, contacting each of the three binding sites with comparable affinities. Site IV contains the core unwinding element (indicated by an open rectangle). The MAR-binding domain together with the homeodomain specifically bind to the core unwinding element within the base-unpairing region of the MAR fragment.


Specific Mutations within the N-terminal Arm of the SATB1 Homeodomain Reduce Homeodomain Activity

Homeodomains generally contact DNA through two separate regions. The N-terminal arm lies in the minor groove, where specific DNA contacts are mediated by Arg-3 and Arg-5. The third α-helix, or recognition helix, fits in the major groove of the recognition site, and Gln-50 and Asn-51 were shown to specifically contact DNA (32, 37). These residues are conserved in the SATB1 homeodomain, and we tested by site-directed mutagenesis whether these residues are required for the homeodomain-mediated increase in affinity. In GST(MD + HD), Arg-3 and Arg-5 of the N-terminal arm of the homeodomain were substituted with alanine residues (mut R3R5), and in the putative third helix the FQN motif (position 50-52) was replaced with alanine residues (mut FQN) (Fig. 4A). Mut R3R5 showed a 4.4-fold decrease in binding affinity to the 3' MAR in comparison to that of wild-type MD + HD (Fig. 4B). The effect of mut R3R5 is, therefore, comparable to the effect of the homeodomain deletion that resulted in a 6-fold decrease in affinity. Mut FQN showed an intermediate effect on binding affinity by exhibiting a 2.4-fold decrease in binding (Fig. 4B). Thus, the major contribution of the homeodomain is mediated by its N-terminal arm, most likely in the minor groove. This binding may be supported by the interaction of the third helix of the homeodomain in the major groove.


Fig. 4. Specific amino acid residues in the N-terminal arm and the third helix are required for homeodomain activity, and specific nucleotides in the core unwinding element (site IV) are critical for SATB1 homeodomain recognition. A, amino acid sequence of the SATB1 homeodomain. The putative α-helices are drawn as open boxes. The mutated proteins were derived from (MD + HD), and the amino acid substitutions to alanine in mut R3R5 and mut FQN are indicated, and the dashed line represents unchanged residues. B, left panel, list of the mutations that were introduced in the 3' MAR (described in Ref. 14 except for mut 8). Mutated nucleotides are shown in bold characters with an asterisk. SATB1 binding site IV is indicated on the top, and the sequence necessary for HD interaction is shown at the bottom. Right panel, relative Kd values derived from gel mobility shift experiments with the respective fragments. Kd relative = 1 corresponds to 2.5 nM. For each mutation the entire 300-bp fragment was used in the gel mobility shift experiments. Each experiment was repeated at least two times.


The Homeodomain Recognizes a Short (C/A)TAATA Motif That Colocalizes with the Core Unwinding Element

To examine if specific residues in binding site IV are necessary for homeodomain recognition, we analyzed a series of single point mutations as shown in Fig. 4B, left panel. Among these, mut 4, mut 5, and mut 6 each had one of the three base substitutions made in mut IV. These single point mutations did not alter the high unpairing propensity of DNA sequences in the 3' MAR (14). When Kd values were determined for (MD + HD) versus (MDΔHD) using these singly mutated fragments, it was found that the homeodomain did not increase binding affinity of the GST-fused SATB1 to mut 5, mut 6, or mut 7 (just like for mut IV). Mut 8, in which 5'-CTAATA-3' was replaced with 5'-ATAATA-3', had an intermediate effect; the homeodomain still increased binding affinity to mut 8, although to a lesser extent than wild type. These experiments show that the specific sequence 5'-(C/A)TAATA-3' (742-747), located within binding site IV 5'-TTCTAATATAT-3' (740-750), is essential for recognition by the SATB1 homeodomain. The MAR domain alone did not distinguish the point mutations in the 300-bp 3' MAR fragment; the Kd values for (MDΔHD) were essentially the same for wild-type 3' MAR, mut IV, and mut 2-8 (Fig. 4B). Furthermore, mut R3R5 had a similar effect on binding affinity as the homeodomain deletion (MDΔHD). This series of experiments indicates that the specificity of SATB1 toward the core unwinding element of the 3' MAR is achieved by the presence of both MAR-binding domain and the homeodomain. It remains to be established whether the homeodomain, when linked to the MAR-domain in the natural protein context, directly contacts DNA.
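
The position of the 5'-(C/A)TAATA-3' motif within site IV can be reproduced with a simple pattern search. The sketch below uses only the site IV sequence and the coordinates given in the text (740-750); the function name is hypothetical, and the third test sequence is an invented single-base variant used for illustration, not one of the mutants analyzed in this study.

```python
import re

SITE_IV = "TTCTAATATAT"   # binding site IV, positions 740-750 (from the text)
SITE_IV_START = 740
HD_MOTIF = re.compile(r"[CA]TAATA")  # 5'-(C/A)TAATA-3'


def find_motif(sequence, first_position):
    """Return (start, end, match) tuples in 1-based sequence coordinates."""
    return [
        (first_position + m.start(), first_position + m.end() - 1, m.group())
        for m in HD_MOTIF.finditer(sequence)
    ]


if __name__ == "__main__":
    print("wild type:", find_motif(SITE_IV, SITE_IV_START))
    # Mut 8 replaces CTAATA with ATAATA; the degenerate motif still matches.
    print("mut 8:", find_motif("TTATAATATAT", SITE_IV_START))
    # Invented variant (not from the paper) with the motif disrupted.
    print("hypothetical variant:", find_motif("TTCTAGTATAT", SITE_IV_START))
```

A pattern match of this kind identifies only the sequence; the graded effect of mut 8 on affinity described above is an experimental observation that a yes-or-no match cannot capture.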


DISCUSSION

SATB1, a cell type-specific MAR-binding protein essential for T-cell development, contains a MAR-binding domain and a newly identified atypical homeodomain. These two domains act together to confer binding specificity toward the core unwinding element found within a MAR.

Multiple Domain Structure of SATB1

The MAR-binding protein SATB1 contains a MAR-binding domain, a homeodomain, and two Cut-like repeats. The SATB1 homeodomain is unique among known homeodomains; a striking feature is the replacement of the invariant tryptophan at position 49 of the homeodomain with a phenylalanine in SATB1. This may have important implications for the structure and function of the protein, since tryptophan 49 is not only conserved in all the homeodomains so far identified but is also essential for homeodomain function. Mutations of the WFQ motif containing tryptophan 49 in the oct-1 homeodomain abolished DNA binding (38), and the mutant phenotype of dwarf mice, characterized by abnormal development of the anterior pituitary gland, is caused by a single point mutation that replaces tryptophan with cysteine in the POU homeodomain of pit-1 (39).

The presence of Cut-like repeats and a homeodomain in SATB1 suggests structural similarity to the Cut proteins identified from various species (34, 36, 40, 41). Cut proteins contain a set of three Cut repeats followed by a homeodomain. The phenotype of mutants in Drosophila suggests a role for Cut protein in cell specification in several tissues including the wing (42), the external sensory organs (34), and Malpighian tubules (43). SATB1 may be considered a distant relative of the Cut family of proteins; however, the SATB1 homeodomain shares more homology with the homeodomain of engrailed (33% identity) than with that of Cut proteins (26% identity). Furthermore, unlike known Cut repeats that were shown to be specific DNA-binding domains (33, 35, 44), the Cut-like repeats in SATB1 did not appear to bind SATB1-binding sites. It remains to be established whether the SATB1 Cut-like repeats recognize other DNA sequences that were not tested here.

Homeodomain Contribution to MAR Binding

The isolated SATB1 homeodomain exhibits only very weak nonspecific binding activity to base-unpairing sequences. The MAR domain, on the other hand, can bind independently with high affinity and specificity; it distinguishes MARs that can unwind from mutated MARs that have lost this capability. Thus, the homeodomain initially appeared to be insignificant in DNA binding. However, a unique function is now attributed to this homeodomain. When associated with the MAR domain in the natural protein context, the SATB1 homeodomain directs the MAR domain to the core unwinding element of a MAR. This distinguishes SATB1 from the way in which Paired protein, the Drosophila Cut, and the mammalian Cut-like proteins recognize their target DNA. In these proteins, the homeodomains bind DNA independently, and the associated domains contribute to binding specificity by making additional DNA contacts (33, 35, 44, 45). The SATB1 homeodomain is similar to the homeodomains in the POU transcription factors, which cannot bind independently or bind with low affinity and relaxed specificity (reviewed in Ref. 46). In the case of the POU transcription factors, both the POU domains and the homeodomains are equally required for high affinity binding, and together they form a bipartite binding domain (38). For SATB1, on the other hand, the MAR domain alone displays fully functional MAR-binding activity, and the contribution of the homeodomain results in further selection of specific elements embedded within a MAR sequence context. The contribution of the homeodomain is small, however, and was previously missed when the minimum domain that confers MAR binding was delineated (20). This could be due in part to the fact that the active protein fraction in the full-sized bacterially produced SATB1 was not accurately determined.

The dissection of SATB1 protein into individual components has brought to light how these multiple levels of recognition are ultimately put together to achieve a high degree of binding site specificity that is unprecedented among MAR-binding proteins. This is illustrated in Fig. 5. We had previously shown that SATB1 does not bind MARs merely on the basis of their high AT content but that it specifically recognizes AT-rich regions in MARs that have a high propensity for base unpairing, and within these base-unpairing regions it exhibits a preference for binding to the core unwinding element (13). In addition, we showed in a separate study using a phage display library of random peptides that a short peptide homologous to the N-terminal arm of the MAR-binding domain can effectively recognize AT-rich DNA (47). This suggests that the short homologous N- and C-terminal amino acid stretches of the MAR-binding domain are individually sufficient for recognizing AT-rich DNA, but that the entire 150-amino acid MAR-binding domain is required to distinguish between AT-rich DNA with high unwinding propensity and DNA that lacks this property. Within an AT-rich DNA sequence with high unwinding propensity, the specific recognition of the core unwinding element that is critical in affecting overall DNA structure of the MAR (14) is achieved by the combined action of a unique homeodomain and a MAR-binding domain. Core unwinding elements have been identified in several other MARs, such as in the MAR at the 5' boundary of the human β-globin locus control region (48).2 These elements are remarkably similar to the SATB1 homeodomain recognition site of the IgH 3' MAR, which suggests that SATB1 may exhibit preference for core unwinding elements in general.

Unusual Mode of Binding of the MAR Domain and Homeodomain in SATB1

The MAR-binding domain in SATB1 binds DNA in the minor groove, making little contact with DNA bases. SATB1 presumably recognizes DNA sequences indirectly by binding to the altered sugar phosphate backbone structure dictated by a specific DNA sequence context (13). Although the homeodomain in SATB1 does not bind DNA independently, mutagenesis of the target DNA revealed that a specific sequence, 5'-(C/A)TAATA-3', in the SATB1 binding site IV is necessary for the increase in affinity mediated by the homeodomain. Furthermore, the increase in affinity was almost completely abolished by alanine substitutions of arginine residues in the N-terminal arm of the SATB1 homeodomain, a region known in other homeodomains to bind the minor groove. The corresponding region in other homeodomains was found to be flexible and to lack any secondary structure, as shown by NMR and x-ray crystallography (reviewed in Ref. 49). Therefore, the effect resulting from alanine substitutions of the two arginine residues is unlikely to be a consequence of a change in the overall protein folding. These results taken together suggest, but do not prove, that the homeodomain, in the context of the SATB1 protein, may directly contact the target DNA site in the minor groove. In contrast to other homeodomains, mutagenesis of residues in the third helix, which is known to interact with the major groove, has only a minor effect on SATB1 binding. This finding is consistent with previous results showing that SATB1 is a minor groove binding protein.

The SATB1 homeodomain recognition sequence found in site IV is similar to the homeodomain binding site consensus, which has a TAAT core (22, 50), and it overlaps with the direct SATB1 contact site IV. Missing nucleoside experiments revealed no additional contacts with (MD + HD) compared with (MDΔHD) (data not shown). This result, taken together with the fact that the sequence 5'-(C/A)TAATA-3' in site IV is responsible for the positive effect of the homeodomain in SATB1 binding, may suggest that upon binding to a MAR, the SATB1 homeodomain and the MAR domain contact the same site simultaneously, possibly from opposite sides of the DNA helix. Crystal structure analysis will be needed to determine whether the SATB1 homeodomain in its natural protein context directly makes contact with DNA. It is of interest that the crystal structure of the even-skipped homeodomain showed that two homeodomains bind to one 10-bp consensus sequence on opposite faces of the DNA, without any steric hindrance (51). This simultaneous occupation of one site from both sides of the DNA helix could provide significant stability to the protein-DNA complex. This protein-DNA interaction is unusual, however. The multiple DNA-binding domains found in the POU, Cut, and Paired proteins bind to sites that are juxtaposed. Similarly, in the transcription factor oct-1, the POU-specific domain and the homeodomain were suggested to occupy adjacent positions in the major groove (52).

Biological Significance of SATB1 Recognition of MARs

Homeodomains represent the hallmark of developmental regulatory proteins (reviewed in Ref. 21), and the presence of this domain in a MAR-binding protein is unprecedented. In this regard, SATB1 is unique among several other proteins that preferentially bind MARs in vitro including nucleolin (15), topoisomerase II (19), histone H1 (53), the high mobility group proteins HMG I/Y (54), lamin B1 (18), ARBP (55), and hnRNPU (SAF-A) (56-58). In fact, a recent study of SATB1 knockout mice showed that SATB1 ablation results in a major defect in T-cell development and alterations in expression of multiple genes.3 Genomic DNA sequences that are bound to SATB1 in vivo have recently been characterized based on cross-linking techniques. This study revealed that in the nucleus SATB1 actually binds DNA sequences containing ATC sequence clusters and that these sequences are tightly bound to the nuclear matrix, representing MARs.4 This result, together with the results from the SATB1 knockout experiments, suggests that higher order chromatin structure may be involved in T-cell-specific gene regulation. Such regulation could be directed toward MARs at the base of chromatin loops, in particular toward the core unwinding elements, as specified by the combined action of the MAR-binding domain and the homeodomain of SATB1.


FOOTNOTES

*   This work was supported by National Institutes of Health Grant R01 CA 39681 (to T. K.-S.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡   Present address: Dept. of Molecular Biology, The Scripps Research Institute, 10550 N. Torrey Pines Road, MB-27, La Jolla, CA 92037.
§   Present address: Dept. of Immunology, The Scripps Research Institute, 10550 N. Torrey Pines Road, IMM-17, La Jolla, CA 92037.
   Present address and to whom correspondence should be addressed: Life Science Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd., Mail Stop 70A-1118, University of California, Berkeley, CA 94720. Tel.: 510-486-4983; Fax: 510-486-4545; E-mail: terumiks@lbl.gov.
1   The abbreviations used are: MAR, matrix attachment region; GST, glutathione S-transferase; wt, wild type; mut, mutant; bp, base pair(s); PCR, polymerase chain reaction.
2   L. A. Dickinson and T. Kohwi-Shigematsu, unpublished results.
3   J. D. Alvarez, H. Niida, T. Kohwi-Shigematsu, and D. Y. Loh, unpublished results.
4   I. de Belle, S. Cai, and T. Kohwi-Shigematsu, unpublished results.

ACKNOWLEDGEMENTS

We thank Dr. Yoshinori Kohwi for valuable discussions, Dr. Craig Hauser for helpful comments and critical reading of the manuscript, and Dr. Joel Gottesfeld for expert advice and constructive criticism of the manuscript.


REFERENCES

  1. Nelson, W. G., Pienta, K. J., Barrack, E. R., and Coffey, D. S. (1986) Annu. Rev. Biophys. Biophys. Chem. 15, 457-475
  2. Earnshaw, W. C. (1988) BioEssays 9, 147-150
  3. Gasser, S. M., and Laemmli, U. K. (1987) Trends Genet. 3, 16-22
  4. Cockerill, P. N., Yuen, M.-H., and Garrard, W. T. (1987) J. Biol. Chem. 262, 5394-5397
  5. Klehr, D., Maass, K., and Bode, J. (1991) Biochemistry 30, 1264-1270
  6. Mielke, C., Kohwi, Y., Kohwi-Shigematsu, T., and Bode, J. (1990) Biochemistry 29, 7475-7485
  7. Poljak, L., Seum, C., Mattioni, T., and Laemmli, U. K. (1994) Nucleic Acids Res. 22, 4386-4394
  8. Jarman, A. P., and Higgs, D. R. (1988) EMBO J. 7, 3337-3344
  9. Gasser, S. M., and Laemmli, U. K. (1986) EMBO J. 5, 511-518
  10. Bode, J., Kohwi, Y., Dickinson, L., Joh, T., Klehr, D., Mielke, C., and Kohwi-Shigematsu, T. (1992) Science 255, 195-197
  11. Dietz, A., Kay, V., Schlake, T., Landsmann, J., and Bode, J. (1994) Nucleic Acids Res. 22, 2744-2751
  12. Forrester, W. C., van Genderen, C., Jenuwein, T., and Grosschedl, R. (1994) Science 265, 1221-1225
  13. Dickinson, L. A., Joh, T., Kohwi, Y., and Kohwi-Shigematsu, T. (1992) Cell 70, 631-645
  14. Kohwi-Shigematsu, T., and Kohwi, Y. (1990) Biochemistry 29, 9551-9560
  15. Dickinson, L. A., and Kohwi-Shigematsu, T. (1995) Mol. Cell. Biol. 15, 456-465
  16. Yanagisawa, J., Ando, J., Nakayama, J., Kohwi, Y., and Kohwi-Shigematsu, T. (1996) Cancer Res. 56, 457-462
  17. Herrscher, R. F., Kaplan, M. H., Lelsz, D. L., Das, C., Sheuermann, R., and Tucker, P. W. (1995) Genes Dev. 9, 3067-3082
  18. Ludérus, M. E. E., de Graaf, A., Mattia, E., den Blaauwen, J. L., Grande, M. A., de Jong, L., and van Driel, R. (1992) Cell 70, 949-959
  19. Adachi, Y., Käs, E., and Laemmli, U. K. (1989) EMBO J. 8, 3997-4006
  20. Nakagomi, K., Kohwi, Y., Dickinson, L. A., and Kohwi-Shigematsu, T. (1994) Mol. Cell. Biol. 14, 1852-1860
  21. Gehring, W. J. (1987) Science 236, 1245-1252
  22. Treisman, J., Harris, E., Wilson, D., and Desplan, C. (1992) BioEssays 145-150
  23. Desplan, C., Theis, J., and O'Farrell, P. H. (1988) Cell 54, 1081-1090
  24. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410
  25. Sturrock, S. S., and Collins, J. F. (1993) MPsrch, Version 1.3. Biocomputing Research Unit, University of Edinburgh, UK
  26. Smith, D. B., and Johnson, K. S. (1988) Gene (Amst.) 67, 31-40
  27. Ekker, S. C., Young, K. E., von Kessler, D., and Beachy, P. A. (1991) EMBO J. 10, 1179-1186
  28. Clemens, K. R., Zhang, P., Liao, X., McBryant, S. J., Wright, P. E., and Gottesfeld, J. M. (1994) J. Mol. Biol. 244, 23-35
  29. Higuchi, R., Krummel, B., and Saiki, R. K. (1988) Nucleic Acids Res. 16, 7351-7367
  30. Bairoch, A., and Boeckmann, B. (1994) Nucleic Acids Res. 22, 3578-3580
  31. Scott, M. P., Tamkun, J. W., and Hartzell, G. W., III (1989) Biochim. Biophys. Acta 989, 25-48
  32. Kissinger, C. R., Liu, B., Martin-Blanco, E., Kornberg, T. B., and Pabo, C. O. (1990) Cell 63, 579-590
  33. Andrés, V., Chiara, M. D., and Mahdavi, V. (1994) Genes Dev. 8, 245-257
  34. Blochlinger, K., Bodmer, R., Jack, J., Jan, L. Y., and Jan, Y. N. (1988) Nature 333, 629-635
  35. Harada, R., Dufort, D., Denis-Larose, C., and Nepveu, A. (1994) J. Biol. Chem. 269, 2062-2067
  36. Neufeld, E. J., Skalnik, D. G., Lievens, P. M., and Orkin, S. H. (1992) Nat. Genet. 1, 50-55
  37. Gehring, W. J., Qian, Y. Q., Billeter, M., Furukubo-Tokunaga, K., Schier, A. F., Resendez-Perez, D., Affolter, M., Otting, G., and Wüthrich, K. (1994) Cell 78, 211-223
  38. Sturm, R. A., and Herr, W. (1988) Nature 336, 601-604
  39. Li, S., Crenshaw, E. B., III, Rawson, E. J., Simmons, D. M., Swanson, L. W., and Rosenfeld, M. G. (1990) Nature 347, 528-533
  40. Andrés, V., Nadal-Ginard, B., and Mahdavi, V. (1992) Development 116, 321-334
  41. Valarché, I., Tissier-Seta, J. P., Hirsch, M. R., Martinez, S., Goridis, C., and Brunet, J. F. (1993) Development 119, 881-896
  42. Jack, J., Dorsett, D., Delotto, Y., and Liu, S. (1991) Development 113, 735-747
  43. Liu, S., and Jack, J. (1992) Dev. Biol. 150, 133-143
  44. Harada, R., Bérubé, G., Tamplin, O. J., Denis-Larose, C., and Nepveu, A. (1995) Mol. Cell. Biol. 15, 129-140
  45. Treisman, J., Harris, E., and Desplan, C. (1991) Genes Dev. 5, 594-604
  46. Rosenfeld, M. G. (1991) Genes Dev. 5, 897-907
  47. Wang, B., Dickinson, L. A., Koivunen, E., Ruoslahti, E., and Kohwi-Shigematsu, T. (1995) J. Biol. Chem. 270, 23239-23242
  48. Yu, J., Bock, J. H., Slightom, J. L., and Villeponteau, B. (1994) Gene (Amst.) 139, 139-145
  49. Laughon, A. (1991) Biochemistry 30, 11357-11367
  50. Müller, M., Affolter, M., Leupin, W., Otting, G., Wüthrich, K., and Gehring, W. J. (1988) EMBO J. 7, 4299-4304
  51. Hirsch, J. A., and Aggarwal, A. K. (1995) EMBO J. 14, 6280-6291
  52. Dekker, N., Cox, M., Boelens, R., Verrijzer, C. P., van der Vliet, P. C., and Kaptein, R. (1993) Nature 852-855
  53. Izaurralde, E., Käs, E., and Laemmli, U. K. (1989) J. Mol. Biol. 210, 573-585
  54. Zhao, K., Käs, E., Gonzalez, E., and Laemmli, U. K. (1993) EMBO J. 12, 3237-3247
  55. von Kries, J. P., Buhrmester, H., and Strätling, W. H. (1991) Cell 64, 123-135
  56. Fackelmayer, F. O., Dahm, K., Renz, A., Ramsperger, U., and Richter, A. (1994) Eur. J. Biochem. 221, 749-757
  57. Tsutsui, K., Tsutsui, K., Okada, S., Watarai, S., Seki, S., Yasuda, T., and Shohmori, T. (1993) J. Biol. Chem. 268, 12886-12894
  58. von Kries, J. P., Buck, F., and Strätling, W. H. (1994) Nucleic Acids Res. 22, 1215-1220

©1997 by The American Society for Biochemistry and Molecular Biology, Inc.