A Linker Region of the Yeast Zinc Cluster Protein Leu3p Specifies Binding to Everted Repeat DNA*

Yaël Mamane‡, Karen Hellauer‡§, Marie-Hélène Rochon‡, and Bernard Turcotte‡§

From the §Department of Medicine, Royal Victoria Hospital and ‡Department of Microbiology and Immunology, McGill University, Montréal, Québec, Canada H3A 1A1

    ABSTRACT

Zinc cluster proteins form a major class of yeast transcriptional regulators. They usually bind as homodimers to target DNA sequences, with each monomer recognizing a CGG triplet. The orientation and spacing of the CGG triplets specify the recognition sequence for a given zinc cluster protein. For instance, Gal4p binds to inverted CGG triplets spaced by 11 base pairs, whereas Ppr1p recognizes a similar motif but with a spacing of 6 base pairs. Hap1p, another member of this family, binds to a direct repeat consisting of two CGG triplets. Other members of this family, such as Leu3p, also recognize CGG triplets, but only when they are oriented in opposite directions, forming an everted repeat. This implies that the two zinc clusters of Leu3p bound to an everted repeat must be oriented in opposite directions to those of Gal4p or Ppr1p bound to inverted repeats. To map the domain responsible for proper orientation of the zinc clusters of Leu3p, we constructed chimeric proteins between Leu3p and Ppr1p and tested their binding to a Leu3p and a Ppr1p site. Our results show that the linker region, which bridges the zinc cluster to the dimerization domain, specifies binding of Leu3p to an everted repeat. We propose that the Leu3p linker projects the two zinc clusters of a Leu3p homodimer in opposite directions, allowing binding to everted repeats.

    INTRODUCTION

Sequencing of the yeast (Saccharomyces cerevisiae) genome has revealed the existence of over 50 proteins that are all characterized by the presence of 6 cysteine residues (consensus sequence: CX2CX6CX5-9CX2CX6-8C) (1, 2). These cysteines are essential for binding two zinc atoms involved in proper folding of the cysteine-rich region, and the proteins are referred to as zinc cluster proteins (3). Although some zinc cluster proteins bind as monomers (4, 5), many of them have been shown to bind specifically to DNA as homodimers with each monomer recognizing a CGG triplet. The most conserved region of the zinc cluster proteins is the zinc cluster itself. The zinc cluster is followed by a linker region of approximately 15 amino acids (aa)1 in length, which bridges the zinc cluster to the dimerization domain. The latter has a coiled-coil structure consisting of heptad repeats with a predominance of hydrophobic residues at the first and fourth positions of the repeat.

Two strategies aimed at generating a diversity of binding sites for the zinc cluster proteins have been identified. First, the zinc cluster proteins recognize three different DNA motifs. For instance, the transcriptional activators of the galactose (Gal4p) and pyrimidine (Ppr1p) pathways recognize CGG triplets when oriented as inverted repeats (also called palindromes) (CGG Nx CCG). Another member of this family, Hap1p, involved in the activation of genes related to cellular respiration, also recognizes CGG triplets but only when oriented as a direct repeat (CGG Nx CGG) (6, 7). More recently, we have identified a third variation of the DNA motif as shown by the transcriptional activators Leu3p and Pdr3p, which are involved in controlling expression of genes related to leucine biosynthesis and multi-drug resistance, respectively. Leu3p and Pdr3p recognize CGG triplets oriented in opposite directions (CCG Nx CGG, see Figs. 1B and 5), a motif called an everted repeat in analogy to sites found for some targets of the nuclear receptors like those for retinoic acid (see Ref. 8 and references therein). Alternatively, DNA targets of Leu3p could be considered to be an inverted repeat with the sequence CCG. However, a chimeric protein in which the zinc cluster of Gal4p was replaced by the corresponding region of Leu3p was shown to recognize a Gal4p site (CGG N11 CCG) but not an inverted repeat with the sequence CCG N11 CGG (9). Because Leu3p binds to DNA as a homodimer (10), these observations imply that the two zinc clusters of Leu3p (and most probably Pdr3p) must be oriented in opposite directions, unlike those of Gal4p or Ppr1p, which have been shown to face each other (11, 12). Thus, three DNA motifs are used by the members of the family of zinc cluster proteins: inverted, direct, and everted repeats.

Diversity of target sites is further increased by changing the spacing between the CGG triplets. Gal4p binds with high affinity to an inverted repeat only when the spacing between CGG triplets is 11 base pairs (bp) (13, 14). Other members of this family recognize inverted repeats with distinct spacings: 10 bp for the transcriptional activator of the proline pathway, Put3p (15) or 6 bp for Ppr1p (13, 16). The binding of Hap1p is restricted to direct repeats with a spacing of 6 bp (6, 7). Similarly, binding of Leu3p or Pdr3p to everted repeats is strictly dependent on spacing between the CGG triplets of 4 bp and 0 bp, respectively (9). Thus, the requirement for a proper spacing is important in generating a diversity of target sites. Binding of a given zinc cluster protein to an alternate DNA motif does not appear to occur. For example, all the known target sites for Gal4p or the Kluyveromyces lactis homologue Lac9p are inverted repeats (see Refs. 14, 17, and 18 for a compilation of the DNA targets). Only direct repeats were recovered from a random site selection performed with Hap1p (7), in agreement with the fact that all known targets of Hap1p are imperfect direct repeats (6, 19-24).
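The orientation and spacing rules described above lend themselves to a simple sequence scan. The following Python sketch is illustrative only: the function name `find_repeats`, the `MOTIFS` table, and the `max_spacing` cutoff are assumptions introduced here and are not part of the original work. It scans the top strand of a probe for the three arrangements exactly as they are written in the text (CGG Nx CCG, CGG Nx CGG, and CCG Nx CGG) and reports the spacing between triplets. The two probe sequences are those listed under "Electrophoretic Mobility Shift Assays".

```python
# Triplet arrangements as written in the text (top strand, 5'->3').
# The names and this table are illustrative assumptions, not the authors' code.
MOTIFS = {
    "inverted": ("CGG", "CCG"),  # CGG Nx CCG (e.g. Gal4p, Ppr1p)
    "direct": ("CGG", "CGG"),    # CGG Nx CGG (e.g. Hap1p)
    "everted": ("CCG", "CGG"),   # CCG Nx CGG (e.g. Leu3p, Pdr3p)
}


def find_repeats(seq, max_spacing=12):
    """Return sorted (motif, spacing, start) tuples found on the given strand.

    Only non-overlapping triplet pairs are reported; max_spacing is an
    arbitrary cutoff chosen for this sketch.
    """
    seq = seq.upper()
    hits = []
    for name, (left, right) in MOTIFS.items():
        for i in range(len(seq) - 2):
            if seq[i:i + 3] != left:
                continue
            for spacing in range(max_spacing + 1):
                j = i + 3 + spacing
                if seq[j:j + 3] == right:
                    hits.append((name, spacing, i))
    return sorted(hits)


# Probe sequences quoted from the Experimental Procedures section.
LEU3_SITE = "TCGACCTGCCGGTACCGGCTTGGTCGA"  # UAS of ILV2
PPR1_SITE = "TCTTCGGCAATTGCCGAAGA"         # Ppr1p consensus site

for label, probe in (("Leu3p site", LEU3_SITE), ("Ppr1p site", PPR1_SITE)):
    for motif, spacing, start in find_repeats(probe):
        print(f"{label}: {motif} repeat, N{spacing}, starting at index {start}")
```

A plain string scan reports every arrangement that fits the pattern, including ones produced by overlapping CCGG tetranucleotides (the two CGG triplets of the ILV2 probe also satisfy CGG N3 CGG). The hits are therefore candidates to be tested experimentally, not predictions of binding.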

The crystal structure of the Gal4p DNA binding domain bound to a consensus DNA site revealed a symmetrical structure of the two monomers, each interacting with a CGG triplet (11). The linker has an extended structure and the dimerization domain lies perpendicular to the minor groove, in the middle of the upstream activating sequence (UAS). Other studies have revealed a remarkable similarity in the zinc cluster structures of Ppr1p, Put3p, and Gal4p (12, 25) (reviewed in Ref. 26). However, the linker region of Ppr1p has a folded structure and the Ppr1p linker and the coiled-coil of each monomer are arranged asymmetrically. Folding of the linker renders Ppr1p dimers more compact, leading to recognition of CGG inverted repeats with a reduced spacing of 6 bp. These results are in agreement with studies of Gal4p, Ppr1p, and Put3p chimeras, which showed that the linker region and the beginning of the dimerization domain specify binding to an inverted repeat of a given spacing (27). However, use of chimeric proteins between Hap1p and Ppr1p has demonstrated that, in contrast to Gal4p and Ppr1p, the zinc cluster of Hap1p is solely responsible for positioning the two monomers to a direct repeat (28).

We wished to determine the region of the zinc cluster proteins responsible for binding to an inverted motif or an everted repeat. We constructed chimeric proteins derived from Leu3p and Ppr1p and tested their binding in an electrophoretic mobility shift assay (EMSA). Our results show that the linker region of Leu3p specifies binding to an everted repeat.

    EXPERIMENTAL PROCEDURES

GST Expression Vectors-- A portion of the PPR1 gene (29) that encodes its DNA binding domain (aa 1-144) was PCR amplified with Vent polymerase (New England Biolabs) using the primers 5'-CGGGATCCATGAAGCAGAAAAAATTTAA-3' (initiator codon in bold) and 5'-GGAATTCTATTTCTGAACCAAA-3' and yeast genomic DNA isolated from the strain YPH499 (30) as a template. The PCR product was cut by BamHI and EcoRI, purified on a column (PCR purification kit, Promega), and subcloned into pGEX-f (9) cut with the same enzymes to give pPPR1(1-144). This plasmid has the PPR1 coding sequences in frame with the GST sequences and a stop codon, introduced during PCR amplification, after aa 144. Sequencing of the coding region of PPR1 showed that no mutations were introduced during PCR amplification. A similar expression vector for Leu3p, pLEU3(1-147), has been described previously (9). Two strategies were used to generate expression vectors for chimeric proteins: site-directed mutagenesis (31) and PCR using the megaprimer technique (32). All the constructs were sequenced to confirm their integrity.

To facilitate construction of expression vectors for chimeric proteins, we introduced, in the same reading frame, a unique site (SphI or BglII) within specific regions of LEU3 and PPR1. SphI sites were introduced at PPR1 sequences corresponding to the first or the last cysteine as described for LEU3 (9) using the oligonucleotides 5'-TCTAGAACTGCATGCAAACGATGTCGATT-3' and 5'-CAAAATTAGAGGTAGCATGCGTTTCTTTGGACCC-3'. A BglII site was also introduced in LEU3 at the border between the linker and the dimerization domain using the oligonucleotide 5'-GAACTTATAAAAGAAGATCTAACGAAGCCATTGA-3'. PPR1 possesses a natural BglII site at this position. To construct chimeras, PPR1 and LEU3 DNAs were digested with SphI or BglII and the various fragments derived from LEU3 or PPR1 were ligated to give LP-A, LP-D, LP-F, LP-H, LP-5 (encoding the N terminus and the zinc cluster of LEU3 followed by the linker and the dimerization domain of PPR1), and LP-6 (encoding the N terminus and the zinc cluster of PPR1 followed by the linker and the dimerization domain of LEU3). LP-E, LP-G, LP-B, and Leu3-1 were constructed by PCR with Vent polymerase (New England Biolabs). LP-E (an extended version of LP-D) was constructed by amplifying a portion of pLEU3(1-147) using the primers 5'-AGGACGTTCCAAGATCTTACGTCTTTTTTCTGGAAGATAGATTCAAGGAACTCACCAGAA-3' and 3'-GST (5'-CCGGGAGCTGCATGTGTCAGAGG-3'). The PCR product was digested with BglII and EcoRI, purified on a column (PCR purification kit; Promega), and subcloned into pLP-D cut with BglII and EcoRI. LP-I (an extended version of LP-H) was constructed by using the megaprimer technique. A region of LP-H was amplified by PCR with Vent polymerase (New England Biolabs) using the primers 5'-AGAAGAACTTATAAAAGAGCAAGGAACGAAGCCATTGAAAAAAGATTGGCTGTCATGATG-3' and 5'-GGAATTCAAAGTGTTTTGTATG-3'. The PCR product was purified on a column (PCR purification kit; Promega). This product, containing the desired mutation, and the oligonucleotide 5'-AGCACCGGAGCCATGCACTA-3' were used as primers with LP-H as a template. The PCR product was digested with NsiI and EcoRI and subcloned into LP-H cut with the same restriction enzymes.
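The cloning steps above rely on restriction sites carried by the primers. As a hedged illustration, the sketch below checks several of the quoted oligonucleotides for the recognition sequences of the enzymes named in the text. The recognition sequences are the standard ones for these enzymes; the dictionary names, the primer labels, and the helper `sites_in` are assumptions made for this example. Because all four sites are palindromic, scanning a single strand is sufficient.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "SphI": "GCATGC",
    "BglII": "AGATCT",
}

# Oligonucleotides quoted from the Experimental Procedures (5'->3').
# The labels are descriptive names chosen for this sketch.
PRIMERS = {
    "PPR1 forward": "CGGGATCCATGAAGCAGAAAAAATTTAA",
    "PPR1 reverse": "GGAATTCTATTTCTGAACCAAA",
    "PPR1 SphI oligo 1": "TCTAGAACTGCATGCAAACGATGTCGATT",
    "PPR1 SphI oligo 2": "CAAAATTAGAGGTAGCATGCGTTTCTTTGGACCC",
    "LEU3 BglII oligo": "GAACTTATAAAAGAAGATCTAACGAAGCCATTGA",
}


def sites_in(seq):
    """Map enzyme name -> 0-based index of the first site found in seq."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for label, primer in PRIMERS.items():
    found = sites_in(primer)
    summary = ", ".join(f"{enzyme} at {pos}" for enzyme, pos in found.items())
    print(f"{label}: {summary or 'no listed site'}")
```

A check of this kind only confirms that the intended site is present in the oligonucleotide. It does not replace sequencing of the final constructs, which the authors report having done.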

Alanine scan mutants of LEU3 were obtained using the following oligonucleotides: 78A, 5'-AGAAGAACTTATGCAAGAGCAAGGAAC-3'; 79A, 5'-AGAACTTATAAAGCAGCAAGGAACGAA-3'; 81A, 5'-TATAAAAGAGCAGCGAACGAAGCCATT-3'; 82A, 5'-AAAAGAGCAAGGGCCGAAGCCATTGAA-3'; 83A, 5'-AGAGCAAGGAACGCAGCCATTGAAAAA-3'; 85A, 5'-AGGAACGAAGCCGCTGAAAAAAGATTC-3'; 86A, 5'-AACGAAGCCATTGCAAAAAGATTCAAG-3'; 87A, 5'-GAAGCCATTGAAGCAAGATTCAAGGAA-3'; 88A, 5'-GCCATTGAAAAAGCATTCAAGGAACTC-3'; 89A, 5'-ATTGAAAAAAGAGCCAAGGAACTCACC-3'; 90A, 5'-GAAAAAAGATTCGCGGAACTCACCAGA-3'; 91A, 5'-AAAAGATTCAAGGCACTCACCAGAACT-3'; 92A, 5'-AGATTCAAGGAAGCCACCAGAACTTTG-3'.
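The oligonucleotides listed above can be checked for internal consistency. The sketch below assumes, as an inference from the sequences and not a statement made in the text, that each 27-nucleotide oligonucleotide is read in frame from its first base and carries the substituted codon at its centre (the fifth of nine codons). Under that assumption it verifies that every central codon is an alanine codon (GCN). The names `OLIGOS`, `ALA_CODONS`, and `central_codon` are introduced only for this illustration.

```python
# The four alanine codons of the standard genetic code.
ALA_CODONS = {"GCA", "GCC", "GCG", "GCT"}

# Alanine-scan oligonucleotides quoted from the text (5'->3').
OLIGOS = {
    "78A": "AGAAGAACTTATGCAAGAGCAAGGAAC",
    "79A": "AGAACTTATAAAGCAGCAAGGAACGAA",
    "81A": "TATAAAAGAGCAGCGAACGAAGCCATT",
    "82A": "AAAAGAGCAAGGGCCGAAGCCATTGAA",
    "83A": "AGAGCAAGGAACGCAGCCATTGAAAAA",
    "85A": "AGGAACGAAGCCGCTGAAAAAAGATTC",
    "86A": "AACGAAGCCATTGCAAAAAGATTCAAG",
    "87A": "GAAGCCATTGAAGCAAGATTCAAGGAA",
    "88A": "GCCATTGAAAAAGCATTCAAGGAACTC",
    "89A": "ATTGAAAAAAGAGCCAAGGAACTCACC",
    "90A": "GAAAAAAGATTCGCGGAACTCACCAGA",
    "91A": "AAAAGATTCAAGGCACTCACCAGAACT",
    "92A": "AGATTCAAGGAAGCCACCAGAACTTTG",
}


def central_codon(oligo):
    """Split an in-frame oligonucleotide into codons and return the middle one."""
    codons = [oligo[i:i + 3] for i in range(0, len(oligo), 3)]
    return codons[len(codons) // 2]


report = {name: central_codon(seq) for name, seq in OLIGOS.items()}
for name, codon in report.items():
    status = "alanine codon" if codon in ALA_CODONS else "NOT an alanine codon"
    print(f"{name}: central codon {codon} ({status})")
```

Read in the same frame, the codons for positions 80 and 84 already encode alanine in the flanking wild-type sequence, which would be consistent with the absence of 80A and 84A mutants from the list.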

Protein Purification-- Expression and purification of the proteins were performed as described (9) except that cells were freeze/thawed three times and sonicated for 10 s. For all proteins, the GST moiety was removed by cleavage with thrombin. The polypeptides were 50-90% pure as judged by SDS-polyacrylamide gel analysis followed by Coomassie staining. Ppr1p has a significantly faster mobility than Leu3p on a denaturing gel even though they have a very similar molecular size (Leu3p, 17 kDa; Ppr1p, 16 kDa). This may be due to abnormal migration because of the high content of glutamic acids in the DNA binding domain of Leu3p as observed for the chicken progesterone receptor (40). Chimeric proteins migrated at intermediary positions relative to Leu3p and Ppr1p (data not shown).

Electrophoretic Mobility Shift Assays-- DNA sequences of the probes used in EMSA are TCGACCTGCCGGTACCGGCTTGGTCGA (Leu3p site: UAS of ILV2; Ref. 33) and TCTTCGGCAATTGCCGAAGA (Ppr1p site, Ref. 11). EMSAs were performed using salmon sperm DNA as nonspecific competitor as described (9) with increasing concentrations (1-, 2-, 4-, 8-, and 16-fold) of recombinant proteins. The gels were run at 4 °C. Dissociation constants (Kd values) were estimated (34) by performing EMSAs in the absence of competitor DNA with a fixed amount of each protein and decreasing concentrations of the appropriate probe. The ratio of bound to free DNA was plotted against [bound DNA] and the Kd estimated from the slope of the graph. Percentage of binding was measured using a Phosphorimager (Fuji).
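The Kd estimate described above corresponds to a Scatchard-type analysis. The sketch below assumes the standard single-site relationship bound/free = (Bmax - bound)/Kd, so that the slope of bound/free plotted against bound equals -1/Kd. The text cites Ref. 34 for the method and does not spell out the equation, so this form is an assumption. The function name `scatchard_kd` and all numerical values are hypothetical; the data are synthetic and are not measurements from this study.

```python
def scatchard_kd(bound, free):
    """Estimate Kd from a linear fit of bound/free (y) against bound (x).

    Assumes a single class of binding sites, so that slope = -1/Kd.
    Concentrations must share the same units; Kd is returned in those units.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired bound/free measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # ordinary least-squares slope
    return -1.0 / slope


# Synthetic, noise-free data generated from a known Kd (hypothetical values, nM).
KD_TRUE = 10.0
BMAX = 5.0  # concentration of active binding sites (fixed protein amount)
free_probe = [1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
bound_probe = [BMAX * f / (KD_TRUE + f) for f in free_probe]

kd_estimate = scatchard_kd(bound_probe, free_probe)
print(f"Estimated Kd: {kd_estimate:.3f} nM")
```

With real gel quantitation the points scatter around the line, and a nonlinear fit of the binding isotherm is often preferred; the linear form is shown here only because it matches the plot described in the text.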

    RESULTS

We were interested in determining which region of the DNA binding domain of Leu3p and Ppr1p controlled recognition of inverted and everted CGG triplets. We constructed chimeric proteins between Ppr1p and Leu3p. Thus, the DNA binding domains of Ppr1p (aa 1-144), Leu3p (aa 1-147) (hereafter referred to as Ppr1p and Leu3p, respectively) or chimeras were expressed in bacteria as GST fusion proteins. The polypeptides were then purified and the GST moiety removed by cleavage with thrombin. The purified polypeptides were then used in an EMSA using the UAS of ILV2 (Leu3p site) (or mutants) and a consensus site for Ppr1p (12) (see Fig. 1B).


Fig. 1.   A, amino acid sequence comparison of the core of the Leu3p and Ppr1p DNA binding domains. Cysteine residues that coordinate zinc atoms are shown underlined and in bold characters. Hydrophobic residues found at positions a and d of the heptad repeats (dimerization domain) are in bold characters. Borders between the zinc cluster domain, the linker region, and the dimerization domains were defined according to previous studies (11, 12, 27, 28) and this work. B, probes used in electrophoretic mobility shift assays. Arrows correspond to the CGG triplets recognized by Leu3p or Ppr1p. The Leu3p probe is derived from the UAS of ILV2 (33) flanked by the sequence TCGA. Mutant probes have been described previously (9), and mutations are underlined. The Ppr1p probe is a consensus site for that protein (12).

The Linker or the Dimerization Domains of Leu3p Specify Binding to Everted Repeats-- A first set of chimeric proteins containing N-terminal sequences of Ppr1p of increasing length were constructed as shown in Fig. 2A. Control experiments demonstrated that Leu3p bound to its target site but not to a Ppr1p site and vice versa (Fig. 2B). Deletion of the first 28 aa of Leu3p resulted in a complex of faster mobility (Fig. 2B) but did not alter the specificity or the affinity of binding of the truncated protein (construct Leu3-1, Fig. 2, A and B). A chimeric protein, where the N terminus of Leu3p was substituted with that of Ppr1p (construct LP-A), gave two major complexes with a probe corresponding to a Leu3p site. In order to test if the fast migrating complex corresponds to a truncated protein or a monomer, we used a mutant probe carrying a mutation in one everted CGG triplet. If the fast migrating complex corresponds to a monomer, significant binding should be observed with the probe because one CGG triplet is left intact. However, this mutant probe (probe 8A, Fig. 1B), known to greatly impair binding of Leu3p in vitro (9), also drastically reduced binding of LP-A suggesting that the fast migrating complexes correspond to a truncated form of LP-A. No complex was detected in the presence of LP-A when a Ppr1p site was used (Fig. 2B), suggesting that the N terminus of Leu3p is not involved in discriminating between a Leu3p and a Ppr1p site.


Fig. 2.   Binding of Leu3p-Ppr1p chimeras to a Leu3p or a Ppr1p site. A, schematic view of the chimeras. Subdomains of the DNA binding domains of Leu3p or Ppr1p are shown on top. Numbers correspond to the amino acids of Leu3p and Ppr1p present in a given chimera. Kd values are listed at right. No binding is indicated by -. B, EMSA was performed with a Leu3p site, a Leu3p mutant site (mutant 8A, Fig. 1B) or a Ppr1p site as indicated on the bottom using wild type Leu3p, Ppr1p, or chimeras as indicated on top of the figure. Triangles correspond to increasing protein concentrations (1-, 2-, 4-, 8-, and 16-fold). The lowest protein concentrations used were 45 nM for Leu3 and Leu3-1, 90 nM for LP-A, 160 nM for Ppr1, and 60 nM for LP-B. For the sequence of the probes, refer to Fig. 1B.

Moreover, the chimera LP-B, which contains the N terminus and the zinc cluster of Ppr1p followed by the linker and dimerization domains of Leu3p, bound to a Leu3p site with high affinity. Because the Leu3p site also contains an inverted CGG triplet spaced by 2 bp, we also tested mutant probes to verify that the chimeric proteins LP-A, LP-B (Fig. 2A) and LP-I (Fig. 3A) recognize the everted but not the inverted motif. Binding was abolished with a mutation that targets the everted repeat (mutant probe 8A, Fig. 1B) but not with a mutation targeting the inverted repeat (probe 5A, Fig. 1B) (data not shown). These results strongly suggest that the chimeric proteins LP-A and LP-B have the same binding specificity as wild type Leu3p. Interestingly, when the junction between the Ppr1p and the Leu3p chimera was shifted so that the linker region contains 4 additional aa derived from Ppr1p (chimera LP-C, Fig. 2A), no binding was detected at a Leu3p or a Ppr1p site (data not shown). Thus, the data define aa 70 as the N-terminal boundary of Leu3p sequences required for binding to an everted repeat. These results indicate that the N-terminal region and the zinc cluster of Leu3p are not involved in specifying binding to an everted repeat.


Fig. 3.   Binding of Leu3p-Ppr1p chimeras to a Leu3p or a Ppr1p site. A, schematic view of the chimeras. Subdomains of the DNA binding domains of Leu3p or Ppr1p are shown on top. Numbers correspond to the amino acids of Leu3p and Ppr1p present in a given chimera. Kd values are listed at right. No binding is indicated by -. NT, Kd not tested. B, EMSA was performed with a Leu3p site, a Leu3p mutant site (mutant 8A, Fig. 1B) or a Ppr1p site as indicated on the bottom using Leu3p-Ppr1p chimeras as indicated on top of the figure. Triangles correspond to increasing protein concentrations (1-, 2-, 4-, 8-, and 16-fold). The lowest protein concentrations used were 250 nM for LP-F and LP-G and 60 nM for LP-I. For the sequence of the probes, refer to Fig. 1B.

Other chimeric proteins, LP-D and LP-E (Fig. 2A), which contain Ppr1p sequences up to the sixth aa of the dimerization domain, did not form a complex with a mobility comparable to that of the wild type proteins on either probe (data not shown). In summary, these experiments place the region of Leu3p critical for recognition of an everted repeat at the C terminus of its DNA binding domain; this region includes the linker and the dimerization domain (aa 70-147).

The Linker of Leu3p Specifies Binding to Everted Repeats-- In order to map more precisely the region of Leu3p responsible for binding to an everted repeat, we tested a series of converse chimeras that contain increasing portions of the Leu3p sequences at the N terminus of Ppr1p. Swapping the N terminus of Ppr1p with the corresponding region of Leu3p (chimera LP-F) did not prevent binding to a Ppr1p site. In addition, LP-G, which contains both the N-terminal and the zinc cluster regions of Leu3p (followed by Ppr1p sequences), bound to a Ppr1p site (with reduced affinity) but not a Leu3p site (Fig. 3B). However, addition of the linker of Leu3p (chimera LP-I) switched the specificity of binding to a Leu3p site. Two complexes are detected when using a Leu3p site. The slow complex migrates at a position similar to Leu3p and is not detected with a mutant probe (Fig. 3B), strongly suggesting that it corresponds to a dimer. In contrast, the fast migrating complex is also seen with a mutant of a Leu3p site as well as with a Ppr1p site. Thus, the fast migrating complex probably corresponds to a monomer. Similar observations were made for some Hap1p-Ppr1p chimeras (28). Because LP-I contains most of the dimerization domain of Ppr1p and recognizes a Leu3p site, this strongly suggests that the dimerization domain of Leu3p (with the exception of the first 6 aa) is not responsible for the specific orientation of the zinc clusters of a Leu3p homodimer. A chimera (LP-H) carrying a shorter segment of the Leu3p linker did not bind to either probe tested (Fig. 3A). Taken together, our results show that the critical region of Leu3p that specifies binding to everted repeats maps to the linker region and the beginning of the dimerization domain between aa 70 and 87.

Alanine Scan Mutagenesis of the Linker-Dimerization Junction of Leu3p-- We then focused our analysis on the Leu3p region that encompasses the linker and the dimerization domain. Each of these residues (aa 78-92) was mutated to alanine and the binding of mutant proteins was analyzed by EMSA as shown in Fig. 4. Three mutants (I85A, F89A, and L92A) gave rise to higher mobility DNA-protein complexes (Fig. 4). The hydrophobic residues isoleucine 85, phenylalanine 89, and leucine 92 correspond to positions "a" or "d" of a heptad repeat (Fig. 4), which was shown to be involved in dimerization of Gal4p, Ppr1p, and Put3p (11, 12, 25). We suggest that these mutants are defective in dimerization and, as a result, bind to DNA as monomers as observed for Hap1p with mutant sites (28). Thus, the beginning of the dimerization domain would map to aa 82. Moreover, our alanine scan identified one aa of the linker region as a critical residue. Indeed, binding was abolished with a change from arginine to alanine at position 81, just before the dimerization domain. This is in agreement with the absence of binding of the chimera LP-H that lacks aa 80-147 of Leu3p (Fig. 3A). Substitution of aa 86 and 87 to alanine reduced DNA binding. Many other changes (positions 78, 79, 86, 90, and 91) had no or minor effects on binding of Leu3p or resulted in increased binding (positions 82, 83, and 88).
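The heptad assignment discussed above can be checked with simple modular arithmetic. The sketch below enumerates the seven possible heptad registers and keeps those that place residues 85, 89, and 92 on an "a" or "d" position. The register drawn in Fig. 4 is not reproduced in the text, so the result is an inference from the residue numbers alone; the function names are introduced only for this example.

```python
HEPTAD = "abcdefg"


def register_letter(residue, anchor_residue, anchor_letter):
    """Heptad letter of `residue`, given that `anchor_residue` sits at `anchor_letter`."""
    offset = HEPTAD.index(anchor_letter) + (residue - anchor_residue)
    return HEPTAD[offset % 7]


def consistent_registers(residues, allowed=("a", "d")):
    """Return every register that places all `residues` on an allowed position."""
    anchor = residues[0]
    matches = []
    for letter in HEPTAD:
        letters = [register_letter(r, anchor, letter) for r in residues]
        if all(item in allowed for item in letters):
            matches.append(dict(zip(residues, letters)))
    return matches


# Residues whose alanine substitution gave faster-migrating complexes.
for register in consistent_registers([85, 89, 92]):
    print(register)

# In that register, the first core position upstream of residue 85:
print("residue 82 ->", register_letter(82, 85, "d"))
```

Only one register satisfies the constraint (85 at "d", 89 at "a", 92 at "d"). In that register residue 82 falls on an "a" position, in line with the suggestion that the dimerization domain begins at aa 82.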


Fig. 4.   Alanine scan mutagenesis of the linker-dimerization junction of Leu3p. The amino acid sequence of the linker and part of the dimerization domain of Leu3p is given on top of the figure. The last cysteine of the zinc cluster is circled. Arrows correspond to the heptad repeats present in the dimerization domain of Leu3p. EMSA is shown at the bottom with each amino acid tested changed to alanine and numbered according to the initiation codon. WT, wild type Leu3p. The probe used in the EMSA is the UAS of ILV2 (see Fig. 1B). Percentage of homodimer binding is given at the bottom of the figure.

    DISCUSSION

Zinc cluster proteins usually bind as homodimers to specific DNA sequences. A major determinant of binding is the recognition of a CGG triplet by each of the two monomers. Two mechanisms that allow zinc cluster proteins to bind to diverse DNA sequences have been identified: 1) variation of the relative orientation of the CGG triplets (inverted, everted, and direct repeats) and 2) alternate spacing between the CGG triplets. Previous studies have shown that the linker region, located between the zinc cluster and the dimerization domain, specifies binding to sites with a given spacing, as shown for Gal4p, Put3p, and Ppr1p that bind to inverted sites with a spacing of 11, 10, and 6 bp, respectively (27).

We show here that the linker of Leu3p specifies the relative orientation of the zinc clusters. The chimera LP-G (Fig. 3A), which contains the N terminus and the zinc cluster of Leu3p followed by the linker and the dimerization domain of Ppr1p, bound to a Ppr1p site. Specificity of binding was switched to a Leu3p site when the fusion protein contained the linker region of Leu3p (LP-I, Fig. 3A). Similar results were obtained with converse fusions. For instance, a fusion protein (LP-B), consisting of the N terminus and the zinc cluster of Ppr1p followed by Leu3p sequences, recognized a Leu3p site. However, chimeric proteins LP-D and LP-E (Fig. 2A), containing Ppr1p sequences (including the linker) followed by the dimerization domain of Leu3p, failed to bind to DNA. It has been shown that Ppr1p is almost insensitive to mutations outside the CGG triplets (13). In contrast, we have shown that, for Leu3p, nucleotides located between the two everted CGG triplets are important for binding in vitro and activity in vivo (41). For example, there is a requirement for a C 5' to the CGG triplet (CCGG) for binding in vitro of Leu3p. The Ppr1p site has the sequence TCGG, which may prevent binding of chimeras LP-D and LP-E even if their zinc clusters are properly oriented to recognize an inverted repeat. The Leu3p linker region may also specify the spacing between the everted CGG triplets, preventing binding to sites with alternate spacing.

Role of the Linker and Dimerization Domains-- Alanine scan mutagenesis of the linker-dimerization region of Leu3p has revealed aa important for DNA binding. For instance, an arginine (position 81), located in the linker region, is essential for binding of Leu3p. Other key aa are located in the dimerization domain, which contains the heptad repeat region. Mutagenesis of isoleucine 85, phenylalanine 89, and leucine 92 disrupts Leu3p dimerization. However, the Leu3p dimerization domain does not play a role in specifying binding to everted repeats. Substitution of the dimerization domain of Leu3p by the corresponding region of Ppr1p does not disrupt binding to a Leu3p site (chimera LP-I, Fig. 3A). Similarly, most of the dimerization domain of Gal4p or Hap1p can be replaced by that of Ppr1p without affecting binding specificity (7, 27). Many characterized zinc cluster proteins bind to DNA as homodimers. However, there is good evidence that the yeast zinc cluster proteins Oaf1p and Oaf2p (35) can form heterodimers. One possibility is that some dimerization domains of zinc cluster proteins direct formation of specific heterodimers resulting in increased diversity of binding sites as observed for other transcriptional regulators like the nuclear receptors and leucine zipper proteins (36, 37).

Role of the Zinc Cluster Region-- Similar to the dimerization domain, the zinc clusters of Leu3p and Ppr1p can be exchanged without affecting binding specificity. These results are supported by our previous data which showed that the replacement of the zinc cluster of Gal4p by the one of Leu3p does not affect binding to a Gal4p site (9). Many residues known to interact with the CGG triplet, like a lysine at the fourth aa after the second cysteine of the zinc cluster, are conserved in Gal4p, Ppr1p, and Leu3p. This conservation extends to Hap1p (1) but, in contrast to Gal4p, Ppr1p, and Leu3p, this region of Hap1p has also been shown to be responsible for orienting the zinc clusters (28). It has been proposed (28) that the zinc clusters of Hap1p interact with each other allowing binding to asymmetric sites unlike Leu3p and Ppr1p that recognize symmetric targets as depicted in Fig. 5. In addition, zinc cluster proteins Put3p, Tea1p, and Cha4p recognize a similar site: inverted CGG triplets spaced by 10 bp (15, 38, 39), while Leu3p and Uga3p recognize an everted repeat spaced by 4 bp (41). However, Tea1p does not bind to a Put3p site (38) and Leu3p does not recognize a Uga3p site (41). It is possible that an additional role of the zinc cluster (or maybe the dimerization domain) is to discriminate between highly related sites by contacting nucleotides that flank the CGG triplets.


Fig. 5.   Model for binding of zinc cluster proteins to inverted, direct, or everted repeats. Drawings are based on the crystal structure of the DNA binding domains of Gal4p (11) and Ppr1p (12), the analysis of Hap1p-Ppr1p chimeras (28), and Leu3p-Ppr1p chimeras (this study). Shaded bars indicate interactions between the zinc clusters of Hap1p.

A Model for Binding of Zinc Cluster Proteins to DNA-- From the data presented in this study and others, we propose a model for binding of the zinc cluster proteins to three different DNA motifs as shown in Fig. 5. We propose that the C-terminal region of the linker, at the border of the dimerization domain, projects the two zinc clusters of the Leu3p homodimer so that they are oriented in opposite directions allowing binding to an everted repeat. Arginine at position 81, which was shown to be critical for binding of Leu3p (Fig. 4), may be involved in projecting the zinc cluster to the right orientation. Because Leu3p does not bind to mutants of the UAS of LEU2 carrying a shorter (3 bp) or a longer (5 bp) spacing between the CGG triplets (9), the linker region must have a rigid structure that prevents binding to sites with different spacings and, most probably, to sites with alternate DNA motifs. Similarly, the zinc clusters of Gal4p or Ppr1p are symmetrically arranged with the linker that controls the head-to-head orientation of the two zinc clusters (Fig. 5). In addition, the linkers of Gal4p, Ppr1p, and Leu3p determine the distance between the zinc clusters and, consequently, restrict the recognition site to CGG triplets with specific spacings. Comparison of the linker regions of members of the zinc cluster family shows no obvious homology (1). Therefore, it is not possible to predict the motif recognized by a given zinc cluster protein by aa comparison.

In conclusion, recognition of CGG triplets is achieved by the homologous zinc cluster region of members of the Gal4p/Ppr1p/Leu3p family while the adjacent linker region controls the relative orientation of the zinc clusters thus allowing recognition of inverted repeats or everted repeats. It will be interesting to correlate the analysis of Leu3p-Ppr1p chimeras with the crystal structure of the Leu3p DNA binding domain.

    ACKNOWLEDGEMENTS

We are grateful to Dr. J. White for critical review of the manuscript and to Dr. A. Nepveu and members of the Laboratory of Molecular Endocrinology for very helpful discussions. We are grateful to Dr. H. Zingg and K. Chu for advice.

    FOOTNOTES

* This work was supported in part by grants from the Natural Sciences and Engineering Research Council of Canada, the Canadian Genome Analysis and Technology program, and the Medical Research Council of Canada. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Scholar of the Medical Research Council of Canada. To whom correspondence should be addressed: Dept. of Medicine, Royal Victoria Hospital, McGill University, 687 Pine Ave. West, Montréal, Québec, Canada H3A 1A1. Tel.: 514-842-1231 (ext. 5046); Fax: 514-982-0893; E-mail: turcotte{at}lan1.molonc.mcgill.ca.

1 The abbreviations used are: aa, amino acid(s); EMSA, electrophoretic mobility shift assay; UAS, upstream activating sequence; bp, base pair(s); GST, glutathione S-transferase; PCR, polymerase chain reaction.

    REFERENCES

  1. Schjerling, P., and Holmberg, S. (1996) Nucleic Acids Res. 24, 4599-4607
  2. Todd, R. B., and Andrianopoulos, A. (1997) Fungal Genet. Biol. 21, 388-405
  3. Vallee, B. L., Coleman, J. E., and Auld, D. S. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 999-1003
  4. Ozcan, S., and Johnston, M. (1996) Mol. Cell. Biol. 16, 5536-5545
  5. Lenouvel, F., Nikolaev, I., and Felenbok, B. (1997) J. Biol. Chem. 272, 15521-15526
  6. Ha, N., Hellauer, K., and Turcotte, B. (1996) Nucleic Acids Res. 24, 1453-1459
  7. Zhang, L., and Guarente, L. (1994) Genes Dev. 8, 2110-2119
  8. Tini, M., Otulakowski, G., Breitman, M. L., Tsui, L.-C., and Giguere, V. (1993) Genes Dev. 7, 295-307
  9. Hellauer, K., Rochon, M.-H., and Turcotte, B. (1996) Mol. Cell. Biol. 16, 6096-6102
  10. Remboutsika, E., and Kohlhaw, G. B. (1994) Mol. Cell. Biol. 14, 5547-5557
  11. Marmorstein, R., Carey, M., Ptashne, M., and Harrison, S. C. (1992) Nature 356, 408-414
  12. Marmorstein, R., and Harrison, S. C. (1994) Genes Dev. 8, 2504-2512
  13. Liang, S. D., Marmorstein, R., Harrison, S. C., and Ptashne, M. (1996) Mol. Cell. Biol. 16, 3773-3780
  14. Vashee, S., Xu, H., Johnston, S. A., and Kodadek, T. (1993) J. Biol. Chem. 268, 24699-24706
  15. Siddiqui, A. H., and Brandriss, M. C. (1989) Mol. Cell. Biol. 9, 4706-4712
  16. Roy, A., Exinger, F., and Losson, R. (1990) Mol. Cell. Biol. 10, 5257-5270
  17. Bram, R. J., Lue, N. F., and Kornberg, R. D. (1986) EMBO J. 5, 603-608
  18. Halvorsen, Y. C., Nandablan, K., and Dickson, R. C. (1991) Mol. Cell. Biol. 11, 1777-1784
  19. Lodi, T., and Guiard, B. (1991) Mol. Cell. Biol. 11, 3762-3772
  20. Pfeifer, K., Prezant, T., and Guarente, L. (1987) Cell 49, 19-27
  21. Prezant, T., Pfeifer, K., and Guarente, L. (1987) Mol. Cell. Biol. 7, 3252-3259
  22. Schneider, J. C., and Guarente, L. (1991) Mol. Cell. Biol. 11, 4934-4942
  23. Winkler, H., Adam, G., Mattes, E., Schanz, M., Hartig, A., and Ruis, H. (1988) EMBO J. 7, 1799-1804
  24. Zitomer, R. S., Sellers, J. W., McCarter, D. W., Hastings, G. A., Wick, P., and Lowry, C. V. (1987) Mol. Cell. Biol. 7, 2212-2220
  25. Swaminathan, K., Flynn, P., Reece, R. J., and Marmorstein, R. (1997) Nat. Struct. Biol. 4, 751-759
  26. Schwabe, J. W., and Rhodes, D. (1997) Nat. Struct. Biol. 4, 680-683
  27. Reece, R. J., and Ptashne, M. (1993) Science 261, 909-911
  28. Zhang, L., and Guarente, L. (1996) EMBO J. 15, 4676-4681
  29. Kammerer, B., Guyonvarch, A., and Hubert, J. C. (1984) J. Mol. Biol. 180, 239-250
  30. Sikorski, R. S., and Hieter, P. (1989) Genetics 122, 19-27
  31. Kunkel, T. A. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 488-492
  32. Seraphin, B., and Kandels-Lewis, S. (1996) Nucleic Acids Res. 24, 3276-3277
  33. Friden, P., and Schimmel, P. (1987) Mol. Cell. Biol. 7, 2708-2717
  34. Harada, R., Berube, G., Tamplin, O. J., Denis-Larose, C., and Nepveu, A. (1995) Mol. Cell. Biol. 15, 129-140
  35. Karpichev, I. V., Luo, Y., Marians, R. C., and Small, G. M. (1997) Mol. Cell. Biol. 17, 69-80
  36. Baxevanis, A. D., and Vinson, C. R. (1993) Curr. Opin. Genet. Dev. 3, 278-285
  37. Mangelsdorf, D. J., and Evans, R. M. (1995) Cell 83, 841-850
  38. Gray, W. M., and Fassler, J. S. (1996) Mol. Cell. Biol. 16, 347-358
  39. Holmberg, S., and Schjerling, P. (1996) Genetics 144, 467-478
  40. Gronemeyer, H., Turcotte, B., Quirin-Stricker, C., Bocquel, M. T., Meyer, M. E., Krozowski, Z., Jeltsch, J. M., Lerouge, T., Garnier, J. M., and Chambon, P. (1987) EMBO J. 6, 3985-3994
  41. Noël, J., and Turcotte, B. (1998) J. Biol. Chem. 273, 17463-17468


Copyright © 1998 by The American Society for Biochemistry and Molecular Biology, Inc.