A Single Residue Substitution Causes a Switch from the Dual DNA Binding Specificity of Plant Transcription Factor MYB.Ph3 to the Animal c-MYB Specificity*

(Received for publication, August 1, 1996)

Roberto Solano‡, Antonio Fuertes, Luis Sánchez-Pulido, Alfonso Valencia, and Javier Paz-Ares§

From the Centro Nacional de Biotecnología-CSIC, Campus Cantoblanco, Carretera de Colmenar Km 15.5, Madrid 28049, Spain


ABSTRACT

Transcription factor MYB.Ph3 from Petunia binds to two types of sequences, MBSI and MBSII, whereas murine c-MYB only binds to MBSI, and Am305 from Antirrhinum only binds to MBSII. DNA binding studies with hybrids of these proteins pointed to the N-terminal repeat (R2) as the most involved in determining binding to MBSI and/or MBSII, although some influence of the C-terminal repeat (R3) was also evident. Furthermore, a single residue substitution (Leu71 → Glu) in MYB.Ph3 changed its specificity to that of c-MYB, and c-MYB with the reciprocal substitution (Glu132 → Leu) essentially gained the MYB.Ph3 specificity. Molecular modeling and DNA binding studies with site-specific MYB.Ph3 mutants strongly supported the notion that the drastic changes in DNA binding specificity caused by the Leu → Glu substitution reflect the fact that certain residues influence this property both directly, through base contacts, and indirectly, through interactions with other base-contacting residues, and that a single residue may establish alternative base contacts in different targets. Additionally, differential effects of mutations at non-base-contacting residues in MYB.Ph3 and c-MYB were observed, reflecting the importance of protein context on DNA binding properties of MYB proteins.


INTRODUCTION

One characteristic of most eukaryotic transcription factors is that they can be grouped into a small number of families, each including factors with sequence similarity over their DNA-binding domain (reviewed in Refs. 1-3). In a given species, different members of the same family usually regulate unique, often partially overlapping, groups of target genes, at least in part due to distinct, although related, DNA binding specificities (4, 5). The basis of distinctiveness/similarity in DNA binding specificity among members of each family of transcription factors is not yet fully understood, although some progress toward rationalizing this problem has already been made (Refs. 6 and 7 and references therein).

One of the families of transcription factors is that of MYB proteins, so named because the first member of the family to be discovered was the product of the avian myeloblastosis oncogene v-myb. Subsequently, members of this family, sharing the MYB DNA-binding domain, have been found in all eukaryotes investigated, from yeast to humans (reviewed in Refs. 8-10).

Structurally, the best characterized member of the family is c-MYB, the cellular homologue of v-MYB, for which the solution structure of its DNA-binding domain has been solved, both in the free form and in complex with DNA (11-15). The c-MYB DNA-binding domain consists of three imperfect repeats (R1, R2, and R3), each of which folds into a variant of the homeodomain helix-turn-helix motif, similar to that of the prokaryotic LexA protein (7, 11-13, 15, 16). The third helix of the R2 repeat, the recognition helix, however, shows certain conformational flexibility in the free form, which is stabilized upon binding to DNA, and the same is true for the equivalent helix of B-MYB (14, 15, 17, 18). MYB repeats are also characterized by the presence of three conserved tryptophan residues regularly spaced by 18 or 19 amino acids that play a relevant role in the folding of the hydrophobic core of the MYB domain (11, 12, 19, 20). In their interaction with DNA, the recognition helices of both R2 and R3 lie on the major groove of the DNA and interact with each other, resulting in a cooperative binding to DNA sequences with the consensus pentanucleotide core CNGTT (12). The R1 repeat, which is missing in all plant MYB proteins (for examples, see Ref. 21) has no observable effect on DNA binding specificity, although it contributes to the stability of the protein·DNA complex (12, 22-24). The three key base contacts are established by residues Lys128 (R2), Lys182 (R3), and Asn183 (R3), which are fully conserved in all known plant and animal MYB proteins (Refs. 12, 21, and 25 and references therein). However, whereas all known animal (R1, R2, and R3) MYB proteins recognize the same type of core sequence as c-MYB, in plants, there are at least some MYB proteins that show binding specificity differing from that of c-MYB (22, 25-33).

A striking case of this divergence in binding specificity is that of MYB.Ph3, which is a transcription factor predominantly found in epidermal cells of Petunia flowers. Like some other plant MYB proteins, such as the C1, Pl, and P proteins from maize, the Am305 protein from Antirrhinum, and others, MYB.Ph3 possibly regulates the flavonoid (phenylpropanoid) biosynthetic pathway (21, 31, 33-37). MYB.Ph3 has been shown to bind to two types of site: MBSI¹ (A(a/D)(a/D)C(G/C)GTTA, where a/D is A, G, or T, A being the preferred base), which conforms to the core consensus sequence CNGTT and is bound by c-MYB; and MBSII (AGTTAGTTA), which resembles the binding site of P and Am305 proteins and which is not bound by c-MYB (30, 31, 33). Our previous studies support the idea that binding of MYB.Ph3 to MBSI and MBSII does not involve alternative orientations of its two MYB repeats, despite the resemblance of these sites to inverted and direct repeats of the GTTA motif, respectively (33, 38).
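The two site definitions above can be written as simple patterns. The following is a minimal sketch, not part of the original study: the function names are hypothetical, and the test sequence AAACGGTTA is merely one sequence that satisfies the stated MBSI consensus.

```python
import re

# MBSI consensus from the text: A(a/D)(a/D)C(G/C)GTTA, where a/D is A, G, or T.
MBSI_CONSENSUS = re.compile(r"A[AGT][AGT]C[GC]GTTA")
# Core pentanucleotide bound by c-MYB: CNGTT, where N is any base.
CORE = re.compile(r"C[ACGT]GTT")
# MBSII as given in the text.
MBSII = "AGTTAGTTA"


def classify_site(seq):
    """Label a 9-nt site as MBSI-like, MBSII, or other (hypothetical helper)."""
    seq = seq.upper()
    if MBSI_CONSENSUS.fullmatch(seq):
        return "MBSI-like"
    if seq == MBSII:
        return "MBSII"
    return "other"


def has_core(seq):
    """Return True if the sequence contains the CNGTT core."""
    return CORE.search(seq.upper()) is not None


for site in ("AAACGGTTA", "AGTTAGTTA"):
    print(site, classify_site(site), has_core(site))
```

Under these patterns MBSII contains no CNGTT core, which is consistent with the statement that c-MYB does not bind it.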

Here we report on the analysis of the molecular determinants that enable MYB.Ph3 to recognize two different types of sequence. Remarkably, a single residue substitution in the R2 repeat of MYB.Ph3 (Leu71 → Glu) changes its specificity to that of c-MYB, and the reciprocal substitution in c-MYB, Glu132 → Leu, essentially confers MYB.Ph3 specificity. We provide evidence that the ability of a single residue substitution to have such great effects on DNA binding specificity reflects the fact that certain residues influence this property directly, through base contacts, and also indirectly, through interactions with other base-contacting residues, and that some residues can establish alternative base contacts in different targets. In addition, we show that substitutions in (presumably) non-base-contacting residues can also affect the DNA binding properties of MYB proteins, and that their effect may be different in c-MYB and MYB.Ph3, thereby underlining the importance of protein context in determining DNA binding.


MATERIALS AND METHODS

Plasmid Constructs and in Vitro Synthesis of Proteins

Constructs coding for Petunia MYB.Ph3ΔC1, murine c-MYBΔR1C1, and Antirrhinum Am305ΔC1 have been described previously (31, 33). Constructs coding for the mutant derivatives of these proteins, used in this study, were obtained by PCR-mediated, site-directed mutagenesis of the corresponding cDNAs as described by Cormack (39). To prepare the constructs encoding MYB chimeric proteins, the cDNA fragments corresponding to the parts of the MYB proteins present in the chimeras were obtained by PCR amplification with one phosphorylated oligonucleotide, that corresponding to the internal part of the chimera. After ligation of the two fragments present in each chimera, a second PCR was performed with the oligonucleotides corresponding to the 5' and 3' ends of the chimeric cDNA. All cDNA fragments coding for mutant or chimeric proteins were cloned into the XbaI-BamHI sites (or XbaI-PstI for P2A3 and M2A3 chimeras) of the pBluescript vector. All PCR fragments used in the constructs were confirmed by sequencing.

RNAs, obtained by in vitro transcription of the corresponding constructs using T7 or T3 polymerase, were used for in vitro translation in the flexi-rabbit reticulocyte system (Promega) supplemented with magnesium acetate and potassium chloride to final concentrations of 2.05 and 75 mM, respectively, in the presence of [35S]methionine, following the manufacturer's instructions. After in vitro translation, SDS-PAGE analysis of the reticulocyte extracts was performed to allow estimation of the amount of each translated protein by measurement of 35S cpm in the corresponding protein band and correction for methionine content.
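The methionine correction described above amounts to dividing the 35S counts in each band by the number of methionines in the protein. The sketch below illustrates this under the assumption that all proteins are labeled at the same specific activity and counted with equal efficiency; the function names and the band values are hypothetical and are not data from this study.

```python
def relative_molar_amount(band_cpm, n_met):
    """Return 35S counts per methionine, which is proportional to moles of protein."""
    if n_met <= 0:
        raise ValueError("protein must contain at least one methionine")
    return band_cpm / n_met


def normalize_to(reference, amounts):
    """Express each amount relative to the named reference protein."""
    ref = amounts[reference]
    return {name: value / ref for name, value in amounts.items()}


# Hypothetical band counts (cpm) and methionine counts, for illustration only.
bands = {"protein_A": (12000.0, 6), "protein_B": (9000.0, 3)}
amounts = {
    name: relative_molar_amount(cpm, n_met) for name, (cpm, n_met) in bands.items()
}
print(normalize_to("protein_A", amounts))
```

In this made-up case the band with fewer counts actually holds more protein, because it contains half as many methionines per molecule.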

DNA Binding Reactions and EMSA

PCR labeling of MBSI, II, IG, and IIG oligonucleotides, DNA binding reactions, and electrophoretic mobility shift assays (EMSAs) were performed as described in Solano et al. (33). Each binding reaction (15 µl) contained 4 ng of labeled DNA, 400 ng of poly(dI·dC), 150 ng of denatured salmon sperm DNA, and rabbit reticulocyte lysate consisting of a measured amount of the in vitro translated protein, supplemented with lysate incubated in the absence of external RNA to give a final volume of 2 µl, so that all reactions had equimolar amounts of protein.
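The way programmed lysate and mock lysate can be combined to give equimolar protein in a fixed 2 µl can be sketched as follows. This is an illustration under stated assumptions, not the authors' protocol: the function name is hypothetical, the relative concentrations are made-up numbers, and the choice to use the full 2 µl for the least concentrated protein is an assumption.

```python
def lysate_volumes(rel_conc, total_ul=2.0):
    """Return, per protein, (programmed lysate µl, mock lysate µl).

    rel_conc maps protein name -> relative molar concentration per µl of lysate.
    The least concentrated protein uses the full volume (an assumption); the
    others are diluted with mock lysate so that every reaction receives the
    same molar amount of protein in the same total lysate volume.
    """
    if any(c <= 0 for c in rel_conc.values()):
        raise ValueError("concentrations must be positive")
    target = min(rel_conc.values()) * total_ul  # molar amount per reaction
    volumes = {}
    for name, conc in rel_conc.items():
        programmed = target / conc
        volumes[name] = (programmed, total_ul - programmed)
    return volumes


# Hypothetical relative concentrations, for illustration only.
print(lysate_volumes({"protein_A": 1000.0, "protein_B": 2000.0}))
```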

Molecular Modeling

The structures used for the analysis were the average NMR structure of c-MYB bound to DNA (GTCAGTTA), as deposited in Protein Data Bank (40) under code 1MSE by Ogata et al. (12), and the best 25 NMR solutions (Protein Data Bank code 1MSF). Modeling of MYB.Ph3 complexed with DNA (MBSI or MBSII) was carried out with the WHATIF package (41). The quality of the resulting structures was assessed by different standard structures based on normality of molecular contacts (42) and deviation from normal exposed hydrophobic surfaces (43). The analysis of alternative conformations for different residues corresponds to the WHATIF secondary structure-specific rotamer data base (version Feb. 1996).


RESULTS

Role of MYB Repeats in Sequence Recognition

To investigate the molecular determinants that allow MYB.Ph3 protein to recognize two different types of sequence, we first analyzed the role of R2 and R3 repeats on DNA binding. For this purpose, we took advantage of the differential affinity of murine c-MYB and Antirrhinum Am305 proteins for both types of MYB.Ph3 consensus binding sites (28, 31, 33). As shown by EMSA (Fig. 1), derivative MYB.Ph3ΔC1 binds both types of consensus sequence (MBSI and MBSII) with the same affinity, whereas derivative c-MYBΔC2R1 binds to MBSI but not to MBSII, and derivative Am305ΔC1 shows the opposite behavior. Protein Am305 also differs from MYB.Ph3 in that it prefers a variant of MBSII with a G in position +2 (MBSIIG; Ref. 31; Fig. 1), whereas an additional difference between c-MYB and MYB.Ph3 is that a change of T for G at position +2 in MBSI (MBSIG) still allows a certain binding by c-MYB and not by MYB.Ph3 (24).


Fig. 1. Differential DNA binding specificity of MYB proteins. a, SDS-PAGE analysis of the 35S-labeled in vitro synthesized deletion derivatives of Petunia MYB.Ph3 (MYB.Ph3ΔC1; amino acids 1 to 180), murine c-MYB (c-MYBΔC2R1; amino acids 89 to 236), and Antirrhinum Am305 proteins (Am305ΔC1; amino acids 1 to 159). The numbers indicate the apparent molecular mass, in kilodaltons, determined by comparison to prestained markers. b, sequence of MBSI (I), MBSII (II), and of their mutant derivatives MBSIG (IG) and MBSIIG (IIG) targets used in the binding assays. The numbers above indicate the position of a particular base in the sequence for the represented strand. In the complementary strands, the numbers would be the same but with a ' sign, i.e. -5', -4', ..., +3', +4'. c, EMSA using the four types of binding sites; the proteins are shown schematically on the left. Only the retarded bands are shown. In all reactions, equimolar amounts of the different proteins as well as of the different DNA fragments were used (see "Materials and Methods").


We constructed hybrid proteins that combined R2 and R3 MYB repeats: P2 and P3 from MYB.Ph3; A2 and A3 from Am305; and M2 and M3 from c-MYB. These chimeric proteins, like their progenitors (Fig. 1), also contained amino acid sequences beyond the strict R2 and R3 repeats, originating from the 5' or the 3' coding parts of the cDNAs, except for M2, which only included an additional methionine from an engineered initiation codon (33). However, previous work with c-MYB (12, 22, 28), as well as the studies with site-directed mutants described in the next sections, showed that the effect of these additional sequences on DNA binding specificity was negligible. As shown in Fig. 2, all chimeric proteins, except P2A3, recognized at least one of the four sequences, albeit generally with lower affinity than their parental proteins, particularly M2A3. The type of sequence recognized (i.e. I or II) was mainly dependent on the type of R2 repeat in the chimera. Thus, proteins with the R2 repeat of MYB.Ph3 (P2) were able to bind type I and type II sequences, as can MYB.Ph3, whereas proteins with the R2 repeat of c-MYB (M2) or Am305 (A2) showed a preference for type I or type II, as found for M2M3 or A2A3, respectively. Because type I and II sequences differ at their 5' halves, R2 should be mainly implicated in the interaction with the 5' half of the sites. On the other hand, M2P3 and M2A3, which share the same R2 repeat, showed differential affinity for I and IG sequences, respectively, thus implicating the R3 repeat in the interaction with the 3' part of the targets. The same conclusion could be drawn from a comparison of A2P3 with A2A3.
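The statement that type I and type II sites differ in their 5' halves, whereas the G variants differ at position +2, can be checked position by position. The sketch below is illustrative only and rests on two assumptions: that the nine positions are numbered -5 to -1 and +1 to +4 with no position 0 (inferred from the Fig. 1 legend), and that AAACGGTTA is taken as a representative MBSI sequence chosen from the consensus. The function names are hypothetical.

```python
# Assumed numbering of the 9 positions of a site (no position 0).
POSITIONS = [-5, -4, -3, -2, -1, 1, 2, 3, 4]

MBSI = "AAACGGTTA"   # assumed representative of the MBSI consensus
MBSII = "AGTTAGTTA"  # MBSII as given in the text


def substitute(site, position, base):
    """Return the site with the base at the numbered position replaced."""
    index = POSITIONS.index(position)
    return site[:index] + base + site[index + 1:]


def differing_positions(site_a, site_b):
    """List (position, base_a, base_b) wherever two 9-nt sites differ."""
    if len(site_a) != len(POSITIONS) or len(site_b) != len(POSITIONS):
        raise ValueError("sites must be 9 nucleotides long")
    return [
        (pos, a, b)
        for pos, a, b in zip(POSITIONS, site_a, site_b)
        if a != b
    ]


MBSIG = substitute(MBSI, 2, "G")    # T to G at position +2
MBSIIG = substitute(MBSII, 2, "G")  # T to G at position +2

print(differing_positions(MBSI, MBSII))
print(differing_positions(MBSI, MBSIG))
```

With these assumptions the I/II differences fall at positions -4 to -1, and the G variants differ from their parents only at +2.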


Fig. 2. Role of the R2 and R3 MYB repeats in DNA binding as detected from the analysis of R2/R3 chimeric MYB proteins. a, SDS-PAGE analysis of the chimeric and original proteins used. Some proteins show bands of higher molecular weight that could represent protein-protein interactions (dimers and others). b, EMSA using the four types of binding sites (Fig. 1) and the proteins shown schematically on the left. P2 and P3, A2 and A3, and M2 and M3 represent the R2 and R3 repeats of the MYB domain of the MYB.Ph3, Am305, and c-MYB proteins, respectively. The autoradiographs corresponding to P2M3 and M2A3 were 3-fold overexposed.


However, there must be some functional interdependence between repeats R2 and R3 in their interaction with the 5' and 3' halves of the sequence, respectively. For instance, the A2P3 protein bound to MBSIIG with higher affinity than P2P3 or M2P3, indicating some role of R2 (A2) in the interaction with the 3' half of the sequence. The same conclusion could be drawn by comparing A2M3 with P2M3 and M2M3, or P2P3 with M2P3. On the other hand, the higher binding affinity of A2P3 than A2A3 to MBSI can be taken as an example of the influence of the R3 repeat on the interaction with the 5' half of the sequence.

Residue Asn125 in the R3 Repeat of MYB.Ph3 Determines Preferential Recognition of T at Position +2

An analysis of R2- and R3-specific residues responsible for the differential binding specificity of MYB.Ph3 versus c-MYB and/or Am305 was then undertaken. The amino acid residue of the R3 repeat determining the preference for T rather than G at position +2 was investigated. A comparison of the amino acid sequence of R3-recognition helices from MYB.Ph3 and Am305, which respectively prefer T and G at position +2, revealed several differences (Fig. 3). Among these, residue Asn125 of MYB.Ph3, which is substituted by an Arg residue in Am305, was selected for site-directed mutagenesis, based on previous evidence that implicated the equivalent residue from c-MYB in the interaction with +2T (see Fig. 7; position -5T in Ref. 12). The MYB.Ph3 (Asn125 → Arg) mutant now preferred the MBSG to the MBS sequences (Fig. 3), thereby revealing the influence of residue Asn125 from MYB.Ph3 (or Arg from Am305) in the specificity of +2 contacts. Mutations of residue Asn125 to Ser, Ile, or His decreased overall affinity without affecting specificity, indicating that this residue is not the only determinant of position +2, in agreement with the studies using chimeric proteins described in the previous section (see also Fig. 7).


Fig. 3. Mutations in Asn125 of MYB.Ph3 influence base preferences at position +2 of MBSI and MBSII. a, alignment of the amino acids of the C-terminal part of the R3 MYB repeat of Antirrhinum Am305 and Petunia MYB.Ph3 proteins. All residues differing in MYB.Ph3 with respect to Am305 are boxed; the mutated residue is also shadowed. b, SDS-PAGE analysis of the mutant and original proteins used. c, EMSA using the four types of binding sites (Fig. 1); the proteins are shown schematically on the left.



Fig. 7. Structure of the MYB·DNA complex. Top, ribbon plot of the minimal DNA binding domain of c-MYB bound to the target GTCAGTTA (12). The structure represented is the average of 25 NMR solutions (1MSE; Ref. 12). Some residues are represented as sticks: guanosines -4 and -2' in the DNA. Lys128, Glu132, and Gln129 are presented in two alternative conformations taken from the average structure (green) and from some of the 25 NMR solutions (1MSF) (white). In NMR solutions 15 and 22 of 1MSF, Lys128 interacts with -4G. In these two solutions, Glu132 sits far away from Lys128; in the plot, the positions of Lys128 and Glu132 are taken from the NMR solution 1MSF-22. The alternative position of Gln129 is taken from 1MSF-13. This is one of the solutions in which Gln129 comes close to the average position of Lys128 or Glu132 (1MSF-1, -4, -5, -6, -18, -19, and -23). Bottom, schematic interpretation of protein-DNA interactions as deduced from the c-MYB-DNA structure (12) and modeling studies of MYB.Ph3 (see "Materials and Methods"). Only residues of the R2 and R3 recognition helices are represented; base-contacting residues in c-MYB and those at equivalent positions in MYB.Ph3 are boxed. In c-MYB, strong interactions are represented by continuous lines, and weak interactions are represented by broken lines. In MYB.Ph3, the classification of interactions has not been attempted. The hydrophobic cavity formed by -2C, -1'C, and +1'C in MBSI and by -2T, -1T, and +1'C in MBSII is highlighted by a line surrounding these bases. The Gs at positions -2' and -4 in MBSI and MBSII, respectively, where the Lys67 of MYB.Ph3 is proposed to establish the alternative contacts, are circled.


Major Role of Residue Leu71 from the R2 Repeat in Dual Recognition by MYB.Ph3

Determinants within the R2 repeat for recognition of MBSI and/or MBSII were investigated using chimeric proteins obtained by full or partial replacement of the recognition helix of the A2 repeat from protein A2M3 by its M2 counterpart, because repeats A2 and M2 determined the most extreme differences in binding to types I and II sequences (Fig. 2). As shown in Fig. 4, full substitution of the recognition helix (A2M3-3 protein) conferred the c-MYB specificity, whereas the partial substitution in A2M3-2 did not greatly alter the A2M3 specificity. This suggested that the N-terminal half of the R2 recognition helix was the major determinant of binding to MBSI or MBSII in the Am305/c-MYB context.


Fig. 4. Residues of the R2 repeat influencing the MBSI/MBSII binding specificity as detected from the analysis of intra R2 chimeric proteins. a, alignment of the amino acids of the C-terminal part of the R2 MYB repeats of MYB.Ph3, Am305, and c-MYB proteins. All of the amino acid differences are highlighted (full boxes). Arrows indicate the part of the recognition helix substituted in the corresponding chimera (A2M3-2 and A2M3-3). b, SDS-PAGE analysis of the chimeric and original proteins used. c, retarded bands of band-shift experiments using the four types of binding sites; the proteins are shown schematically on the left.


To confirm the importance of amino acid residues of the N-terminal part of the R2 recognition helix in the MYB.Ph3 context, we performed EMSA with mutant derivatives of the MYB.Ph3 and c-MYB proteins affecting the three nonconserved positions in these two proteins (Fig. 5). Each replacement had a different effect on the DNA binding properties of MYB.Ph3 and of c-MYB. Remarkably, a single residue substitution in MYB.Ph3 (Leu71 → Glu) conferred c-MYB specificity (with respect to the targets used), and the reciprocal change in c-MYB (Glu132 → Leu) showed the reverse behavior, because it rendered a c-MYB derivative able to bind to MBSII sequences, although showing a slight preference for MBSI sequences. An additional change in c-MYB (Gln-Glu → Ser-Leu) resulted in a preference of this mutant protein for MBSII. It is noteworthy that the MYB.Ph3 (Cys-Ser → Ile-Gln) mutant also showed preference for MBSI sequences. In fact, its relative affinity to MBSI compared to MBSII was higher than that of the corresponding mutant of c-MYB, c-MYB (Glu → Leu), indicating that the effect of this residue is dependent on protein context. A similar conclusion can be drawn by comparing the DNA binding properties of MYB.Ph3 (Ser-Leu → Gln-Glu) and MYB.Ph3 (Cys-Ser-Leu → Ile-Gln-Glu) with those of c-MYB, or of MYB.Ph3 (Cys → Ile) with those of c-MYB (Gln-Glu → Ser-Leu).


Fig. 5. Residues of the R2 repeat influencing the MBSI/MBSII specificity as detected from the analysis of site-specific MYB mutants. a, alignment of the amino acids of the C-terminal part of the R2 MYB repeat of MYB.Ph3 and c-MYB proteins. The residues that were substituted in any of the constructs are highlighted (full boxes). b, SDS-PAGE analysis of the mutants and original proteins used. c, EMSA using the four types of binding sites (Fig. 1); the proteins are shown schematically on the left. Here, in each protein, the residue present at each of three positions where the mutants may differ from their wild-type protein is indicated: empty letter indicates wild type residue; boldface, mutant residue.


Mutational Analysis of the Conserved Lys Residue in the R2 Repeat

A major difference between the two types of MYB.Ph3 binding site is the greater sequence constraints on MBSII compared to MBSI at positions -4 and -3 imposed by the presence of T instead of C at position -2 (33). Thus, for instance, exchanging T in MBSII with C has only a moderate effect on its binding by MYB.Ph3, whereas the reciprocal change in MBSI (i.e. C → T) results in a great impairment of binding (Ref. 33; Fig. 6). Molecular modeling predicted that the highly conserved Lys residue from MYB.Ph3 (Lys67) should interact with -2'G in MBSI, like the equivalent Lys of c-MYB, but with -4G (and perhaps with -3T) in the MBSII sequence (see "Discussion" and Fig. 7); thus, the Lys67 residue may be responsible for these sequence constraints. To test this prediction, the effect of mutating the Lys67 residue (to Ala or Ser) on DNA binding specificity was examined. As shown in Fig. 6, the two mutants had reduced DNA binding affinity but bound better to MBSI and MBSII than to MBSIG, MBSIA (AAAAGGTTA), and MBSIIG, indicating that the mutations did not affect MYB.Ph3 specificity indiscriminately. However, in sharp contrast to wild-type MYB.Ph3, these mutant proteins bound similarly to MBSI and MBSIT (AAATGGTTA), in agreement with the prediction that the Lys67 residue is responsible for sequence constraints at positions -3 and -4 in MBSII. In fact, when Lys67 does not impose constraints on positions -3 and -4 (e.g. when MYB.Ph3 binds to MBSI, or when MYB.Ph3 (Lys67 → Ala/Ser) binds to MBSI or MBSII), it appears that the preferred base at these positions is an A, as in MBSI. It is also noteworthy that the Lys → Ala/Ser mutants showed higher overall affinity to MBSIT than wild-type MYB.Ph3. This could indicate that when the (large and charged) Lys residue of MYB.Ph3 does not establish a base-specific contact, it may perturb base contacts by other residues.


Fig. 6. The role of the conserved Lys67 residue in MYB.Ph3 dual DNA binding specificity as detected by mutational analysis. a, alignment of the amino acids of the C-terminal part of the R2 repeat of MYB.Ph3 and c-MYB proteins. The mutated residue (Lys67) is highlighted (full box). b, SDS-PAGE analysis of the mutants and original proteins used. c, EMSA using six types of binding sites; the proteins are shown schematically on the left. The sequence of the upper strand of the two new oligonucleotides is: MBSIT(IT), AATGGTTA; and MBSIA (IA), AAAGGTTA.



DISCUSSION

DNA binding studies with hybrid MYB proteins and with site-directed MYB mutants reported here indicate that the R3 repeat is primarily responsible for differential binding to MBSI and MBSII compared to MBSIG and MBSIIG, whereas the R2 repeat is primarily involved in determining the MBSI/MBSII specificity (Figs. 2, 3, 4, 5). Additionally, these experiments indicated that both repeats influence each other's primary effect; for instance, the higher relative affinity for MBSIIG (versus MBSII) of the A2P3 chimera with respect to that of P2P3 reveals a role of R2 (A2) in the interaction with the 3' half of the sequence (Fig. 2).

The proposed primary roles of the R2 and R3 repeats and their functional interdependence are in good agreement with the available structural information on c-MYB. Indeed, the NMR solution structure of the complex between the c-MYB R2R3 domain and DNA shows that its repeats physically interact and bind to DNA in a partially overlapping way (12). In this context, it is not surprising that the R2 and R3 amino acid sequences of both Am305 and MYB.Ph3 fit well in the structure of the R2R3 domain from c-MYB (data not shown). This structural similarity is also manifest in the effects of particular residue substitutions, like that of Asn125 → Arg in MYB.Ph3, which resulted in a specificity change at position +2 (Fig. 3), the position interacting with the equivalent residue from c-MYB, Asn186 (Ref. 12; Fig. 7).

The physical interactions between repeats of c-MYB bound to DNA result in (intramolecular) cooperativity (12). In this scenario, it is conceivable that for a MYB domain to be functional, its R2 and R3 repeats must adapt to each other. Our results with R2/R3 MYB chimeras are in line with this suggestion because most of them displayed reduced DNA binding affinity with respect to their progenitors (most notably P2A3). Thus, it appears that co-evolution may have placed constraints on the compatibility between repeats from different MYB proteins.

Leu71, a Key Residue for MYB.Ph3 Dual DNA Binding Specificity

In this study, we have found that a single residue substitution within the recognition helix of the R2 repeat of MYB.Ph3, Leu71 → Glu, switches the dual DNA binding specificity of MYB.Ph3 to the c-MYB specificity, and that the reciprocal (Glu132 → Leu) change in c-MYB essentially confers the MYB.Ph3 specificity. As discussed below, such drastic effects on specificity caused by the Leu → Glu substitution most likely indicate that there are key residues that influence binding specificity not only directly, through base contacts, but remarkably also indirectly, through interactions with other base-contacting residues, and that a single residue can establish alternative base contacts in different targets.

In the NMR average structure of the complex of the c-MYB minimal DNA-binding domain (R2R3) with its target DNA (GTCAGTTA; Ref. 12), Glu132 interacts weakly with DNA (positions -2C and +1'C in our nomenclature, see Figs. 1 and 7; Ref. 12). Additionally, Glu132 establishes an electrostatic interaction with Lys128, a key base-contacting residue that interacts with -2'G. Using molecular modeling (see "Materials and Methods"), we predicted that the change of Glu for Leu would have two consequences: (i) Leu would be in a hydrophobic cavity adequate to allow interaction with C or T at position -2 (see also Ref. 6), the bases respectively present in MBSI and MBSII, whereas Glu would not specifically interact with T, and (ii) the electrostatic interaction between Lys128 of c-MYB (Lys67 in MYB.Ph3) and Glu does not occur, thereby facilitating the interaction of this Lys residue with an alternative position, -4 and to some extent with -3. This is particularly important in the binding to MBSII, which has A rather than G at position -2'. The possibility that a single residue can establish contacts at two alternative positions has been documented/invoked in several instances (44-46).

Further evidence that Lys128 (c-MYB coordinates) can establish alternative base contacts was obtained from the analysis of the 25 available NMR solution structures of the c-MYB(R2R3)·DNA complex. Indeed, in two solutions Lys128 was found to interact directly with -4G, whereas Glu132 was located far away from the average structure and did not interact directly with the DNA (see alternate positions of Lys128 and Glu132 in Fig. 7). Hence, the preferential interaction of c-MYB residue Lys128 with -2'G could be due to the electrostatic attraction of Lys128 toward Glu132 (as close as 2.64 Å in some of the NMR solutions).
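A per-model survey of the Lys128-Glu132 distance of the kind described above can be sketched with a small fixed-column PDB reader. This is an illustration, not the procedure used in the study (which relied on WHATIF): the function names are hypothetical, the residue numbers are assumed to match the numbering in the deposited file, and the coordinates in the demonstration are synthetic, not taken from 1MSF.

```python
import math


def parse_models(pdb_text):
    """Split PDB-format text into models, each a list of atom records."""
    models, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            models.append(current)
            current = []
        elif record == "ATOM":
            current.append({
                "name": line[12:16].strip(),
                "resname": line[17:20].strip(),
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            })
    if current:  # file without MODEL/ENDMDL records
        models.append(current)
    return models


def lys_glu_distance(atoms, lys_resseq=128, glu_resseq=132):
    """Shortest Lys NZ to Glu OE1/OE2 distance in one model, or None if absent."""
    lys = [a["xyz"] for a in atoms
           if a["resname"] == "LYS" and a["resseq"] == lys_resseq and a["name"] == "NZ"]
    glu = [a["xyz"] for a in atoms
           if a["resname"] == "GLU" and a["resseq"] == glu_resseq
           and a["name"] in ("OE1", "OE2")]
    if not lys or not glu:
        return None
    return min(math.dist(p, q) for p in lys for q in glu)


def atom_line(serial, name, resname, resseq, x, y, z, chain="A"):
    """Format a minimal fixed-column ATOM record (used for synthetic data only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Synthetic two-model example; these coordinates are invented for illustration.
synthetic = "\n".join([
    "MODEL        1",
    atom_line(1, "NZ", "LYS", 128, 0.0, 0.0, 0.0),
    atom_line(2, "OE1", "GLU", 132, 3.0, 4.0, 0.0),
    atom_line(3, "OE2", "GLU", 132, 0.0, 0.0, 2.64),
    "ENDMDL",
    "MODEL        2",
    atom_line(1, "NZ", "LYS", 128, 0.0, 0.0, 0.0),
    atom_line(2, "OE1", "GLU", 132, 10.0, 0.0, 0.0),
    atom_line(3, "OE2", "GLU", 132, 0.0, 10.0, 0.0),
    "ENDMDL",
])
distances = [lys_glu_distance(model) for model in parse_models(synthetic)]
print(distances)
```

Applied to a real multi-model file, the same loop would list the models in which the two side chains are within salt-bridge distance and those in which they are far apart.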

These interpretations are also supported by our results with the Lys67 → Ala/Ser substitutions in MYB.Ph3, which broadened specificity at positions -4 and -3 (Fig. 6), and with missing nucleoside assays, which have shown that the nucleoside at position -2' (A) in MBSII is fully dispensable in binding by MYB.Ph3 (33). The requirement for T at position -3 in MBSII (33) is not well understood, although it could reflect that the methyl group of T pushes Lys67 to -4G, or alternatively that Lys67 interacts with GT rather than only with G (see Ref. 6).

MYB DNA Binding Specificity Is Also Influenced by Protein Context

Base-contacting residues play a critical and direct role in determining the specificity of DNA-binding proteins. This is evident from the fact that specificity can be explained to a significant extent using simple rules: the base-contacting specificity of different residues and the (usually) fixed position of base-contacting residues within the DNA-binding domain in each protein family (6, 7, 47). For instance, in MYB.Ph3, the effect of the Asn125 → Arg substitution does conform to these rules. However, there is strong evidence that binding specificity in MYB proteins can also be indirectly modulated by non-base and base-contacting residues. Thus, Am305 shares all putative base-contacting residues with MYB.Ph3 (Asn125 → Arg), but only the latter strongly binds to MBSI (Figs. 3, 4, and 7). Likewise, the maize P protein shares all the putative recognition residues with MYB.Ph3 and/or c-MYB, but it binds to a different site (GGT(T/A)GGT(A/G); Refs. 30 and 35). Moreover, in addition to the indirect effects of the Leu/Glu substitutions discussed above, we have also shown that several substitutions in presumably non-base-contacting residues alter specificity and/or affinity (see Fig. 5), and in some instances (e.g. the Gln/Ser substitution) the degree of the effect on specificity was different in the MYB.Ph3 and in the c-MYB contexts.

One possible explanation for these indirect effects on binding specificity could be that residue substitutions affect conformational properties of the protein, thereby influencing the strength of possible contacts by recognition residues or imposing constraints in the structural properties of the DNA (3), because some MYB proteins induce bending/distortions upon binding to DNA (38, 48). In this regard, note the presumed structural flexibility of the R2-recognition helix, a property expected to be very sensitive to mutations (14, 15, 17, 18). Some of the specificity effects of substitutions of non-base-contacting residues could simply be mediated by side-chain interactions with base-contacting residues, such as that of Glu132 with Lys128 in c-MYB. For instance, residue Gln129 (c-MYB coordinates) interacts in the average structure with the phosphate backbone of the DNA, but in some of the solutions, it interacts with Glu132 or with Lys128. In solutions where Gln129 interacts with Glu132 or with Lys128, Lys128 interacts very closely with -2'G (see Fig. 7). Hence, it seems that Gln129 helps maintain Lys128 in the conformation that favors the interaction with -2'G, and this effect could be accentuated when Glu132 is missing (c-MYB (Glu132 → Leu) and MYB.Ph3 (Cys-Ser → Ile-Gln); Fig. 5).

The differential effects of some residue substitutions, such as Gln/Ser, in MYB.Ph3 and c-MYB further underline the importance of protein context in MYB DNA binding specificity, possibly involving interresidue interactions. Indeed, we noticed that, in c-MYB, Glu132 and Gln129 are part of a network with several residues (Asn179, Lys182, Asp178, Arg131, and His135), phosphates, and bases. In MYB.Ph3, one of these residues, His135, is substituted by Ala (Fig. 7), and consequently, the effect of substitutions involving residues at positions 129 and 132 (c-MYB coordinates) cannot be the same in the two protein contexts.

The notion that DNA binding specificity is best viewed as the result of a network of interactions of residue side chains with the DNA backbone and bases, as well as with other residues, rather than the simple and independent contribution of base-contacting residues has also been highlighted for other protein families, such as bZIP, ribbon-helix-helix, homeodomain, prokaryotic helix-turn-helix, and others (for examples, see Refs. 44-52). Obviously, the importance of interresidue interactions will be higher in proteins with physically interacting DNA-binding subdomains, such as the MYB and cut-homeodomain proteins (12, 53).

The number of myb genes in plant species is large, in contrast to that in other types of eukaryotes; for instance, there are at least 20-30 myb genes in Petunia (21), and Arabidopsis contains over 100 of these genes (see Footnote 2). However, 6 of 8 putative recognition residues are fully conserved among all plant MYB proteins with known sequence (30 in databases; data not shown), and the remaining 2 residues are conserved in at least 80% of the proteins. Therefore, mutations in non-base-contacting residues must have greatly contributed to the generation of functional diversity among the members of the plant MYB family.
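The kind of per-position conservation figure quoted above (fully conserved versus conserved in at least 80% of the proteins) can be computed from an alignment in a few lines. The sketch below is illustrative only and does not reproduce the analysis behind the numbers in the text: the function name is hypothetical, and the three-position residue strings are invented toy data, not real MYB sequences.

```python
from collections import Counter
from typing import List, Tuple


def conservation(residue_strings: List[str]) -> List[Tuple[str, float]]:
    """For each aligned position, return the most common residue and the
    fraction of proteins that carry it.

    `residue_strings` holds one string per protein, each listing the residues
    found at the same set of (putative recognition) positions.
    """
    if not residue_strings:
        return []
    width = len(residue_strings[0])
    if any(len(s) != width for s in residue_strings):
        raise ValueError("all residue strings must have the same length")
    result = []
    for i in range(width):
        counts = Counter(s[i] for s in residue_strings)
        residue, n = counts.most_common(1)[0]
        result.append((residue, n / len(residue_strings)))
    return result


# Invented toy data (five hypothetical proteins, three positions each).
toy = ["KNE", "KNE", "KNL", "KNL", "KNL"]
toy_conservation = conservation(toy)
# Positions with a fraction of 1.0 are fully conserved in this toy set.
fully_conserved = [i for i, (_, frac) in enumerate(toy_conservation) if frac == 1.0]
```

With real data, the input strings would be extracted from a multiple sequence alignment at the positions corresponding to the putative recognition residues.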


FOOTNOTES

*   This research was supported by the European Community (contract BIO2-CT93-0101). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡   Recipient of a predoctoral fellowship from the Comunidad Autónoma de Madrid. Present address: Dept. of Biology, Plant Science Institute, University of Pennsylvania, Philadelphia, PA 19104-6018.
§   To whom correspondence should be addressed. Tel.: 341-5854504; Fax: 341-5854506; E-mail: jpazares@samba.cnb.uam.es.
1    The abbreviations used are: MBSI and MBSII, Myb.Ph3 binding sites, types I and II, respectively; PCR, polymerase chain reaction; PAGE, polyacrylamide gel electrophoresis; EMSA, electrophoretic mobility shift assay.
2    I. Romero, A. Fuertes, M. J. Benito, A. Leyva, and J. Paz-Ares, manuscript in preparation.

Acknowledgments

We are very grateful to Drs. Cathie Martin and Roger Watson for providing us with the Am305 and c-MYB progenitor constructs, respectively. We thank Drs. Francisco García-Olmedo, Cathie Martin, Joseph Ecker, Darío García de Viedma, and Antonio Leyva for critical reading of the manuscript. The excellent technical assistance of María Jesús Benito is gratefully acknowledged.


Note Added in Proof

Additional information can be obtained from the authors at the following WWW site: http://gredos.cnb.uam.es/sanchez/Myb.html.


REFERENCES

  1. Guarente, L., and Bermingham-McDonogh, O. (1992) Trends Genet. 8, 27-32
  2. Katagiri, F., and Chua, N. H. (1992) Trends Genet. 8, 22-27
  3. Pabo, C. O., and Sauer, R. T. (1992) Annu. Rev. Biochem. 61, 1053-1095
  4. Mitchell, P. J., and Tjian, R. (1989) Science 245, 371-378
  5. Struhl, K. (1989) Trends Biochem. Sci. 14, 137-140
  6. Suzuki, M., Brenner, S. E., Gerstein, M., and Yagi, N. (1995) Protein Eng. 8, 319-328
  7. Suzuki, M., Yagi, N., and Gerstein, M. (1995) Protein Eng. 8, 329-338
  8. Lüscher, B., and Eisenman, R. N. (1990) Genes Dev. 4, 2235-2241
  9. Graf, T. (1992) Curr. Opin. Genet. Dev. 2, 249-255
  10. Thompson, M. A., and Ramsay, R. G. (1995) BioEssays 17, 341-350
  11. Ogata, K., Hojo, H., Aimoto, S., Nakai, T., Nakamura, H., Sarai, A., Ishii, S., and Nishimura, Y. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 6428-6432
  12. Ogata, K., Morikawa, S., Nakamura, H., Sekikawa, A., Inoue, T., Kanai, H., Sarai, A., Ishii, S., and Nishimura, Y. (1994) Cell 79, 639-648
  13. Ogata, K., Morikawa, S., Nakamura, H., Hojo, H., Yoshimuri, S., Zhang, R., Aimoto, Y., Hirata, Z., Sarai, A., Ishii, S., and Nishimura, Y. (1995) Nat. Struct. Biol. 2, 309-320
  14. Ogata, K., Kanei-Ishii, C., Sasaki, M., Hatanaka, H., Nagadoi, A., Enari, M., Nakamura, H., Nishimura, Y., Ishii, S., and Sarai, A. (1996) Nat. Struct. Biol. 3, 178-187
  15. Jamin, N., Gabrielsen, O. S., Gilles, N., Lirsac, P. N., and Toma, F. (1993) Eur. J. Biochem. 216, 147-154
  16. Frampton, J., Gibson, T. J., Ness, S. A., Döderlein, G., and Graf, T. (1991) Protein Eng. 4, 891-901
  17. Myrset, A. H., Bostad, A., Jamin, N., Lirsac, P. N., Toma, F., and Gabrielsen, O. S. (1993) EMBO J. 12, 4625-4633
  18. Carr, M. D., Wollborn, U., McIntosh, P. B., Frenkiel, T. A., McCormick, J. E., Bauer, C. J., Klempnauer, K. H., and Feeney, J. (1996) Eur. J. Biochem. 235, 721-735
  19. Anton, I. A., and Frampton, J. (1988) Nature 336, 719
  20. Saikumar, P., Murali, R., and Reddy, E. P. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 8452-8456
  21. Avila, J., Nieto, C., Cañas, L. A., and Paz-Ares, J. (1993) Plant J. 3, 553-562
  22. Howe, K. M., Reakes, C. F. L., and Watson, R. J. (1990) EMBO J. 9, 161-169
  23. Gabrielsen, O. S., Sentenac, A., and Fromageot, P. (1991) Science 253, 1140-1143
  24. Tanikawa, J., Yasukawa, T., Enari, M., Ogata, K., Nishimura, Y., Ishii, S., and Sarai, A. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 9320-9324
  25. Li, S. F., and Parish, R. W. (1995) Plant J. 8, 963-972
  26. Biedenkapp, H., Borgmeyer, U., Sippel, A. E., and Klempnauer, K. H. (1988) Nature 335, 835-837
  27. Stober-Grässer, U., Brydolf, B., Bin, X., Grässer, F., Firtel, R. A., and Lipsick, J. S. (1992) Oncogene 7, 589-596
  28. Weston, K. (1992) Nucleic Acids Res. 20, 3043-3049
  29. Urao, T., Yamaguchi-Shinozaki, K., Urao, S., and Shinozaki, K. (1993) Plant Cell 5, 1529-1539
  30. Grotewold, E., Drummond, B. J., Bowen, B., and Peterson, T. (1994) Cell 76, 543-553
  31. Sablowski, R. W. M., Moyano, E., Culiañez-Macia, F. A., Schuch, W., Martin, C., and Bevan, M. (1994) EMBO J. 13, 128-137
  32. Gubler, F., Kalla, R., Roberts, J. K., and Jacobsen, J. V. (1995) Plant Cell 7, 1879-1891
  33. Solano, R., Nieto, C., Avila, J., Cañas, L., Díaz, I., and Paz-Ares, J. (1995) EMBO J. 14, 1773-1784
  34. Paz-Ares, J., Ghosal, D., Wienand, U., Peterson, P. A., and Saedler, H. (1987) EMBO J. 6, 3553-3558
  35. Grotewold, E., Athma, P., and Peterson, T. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 4587-4591
  36. Cone, K. C., Cocciolone, S. M., Burr, F. A., and Burr, B. (1993) Plant Cell 5, 1795-1805
  37. Franken, P., Schrell, S., Peterson, P. A., Saedler, H., and Wienand, U. (1994) Plant J. 6, 21-30
  38. Solano, R., Nieto, C., and Paz-Ares, J. (1995) Plant J. 8, 673-682
  39. Cormack, B. (1992) in Current Protocols in Molecular Biology (Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Smith, J. A., Seidman, J. G., and Struhl, K., eds.) Vol. 1, Unit 8.5, Wiley-Interscience, New York
  40. Abola, E. E., Bernstein, F. C., and Koetzle, T. F. (1988) in Computational Molecular Biology: Sources and Methods for Sequence Analysis (Lesk, A. M., ed), pp. 69-81, Oxford University Press, Oxford, United Kingdom
  41. Vriend, G. (1990) J. Mol. Graphics 8, 52-56
  42. Vriend, G., and Sander, C. (1993) J. Appl. Cryst. 26, 47-60
  43. Holm, L., and Sander, C. (1992) J. Mol. Biol. 225, 93-105
  44. Kim, J., Tzamarias, D., Ellenberger, T., Harrison, S. C., and Struhl, K. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 4513-4517
  45. Smith, D. L., and Johnson, A. D. (1994) EMBO J. 13, 2378-2387
  46. Raumann, B. E., Knight, K. L., and Sauer, R. T. (1995) Nat. Struct. Biol. 2, 1115-1122
  47. Suzuki, M. (1994) Structure 2, 317-326
  48. Saikumar, P., Gabriel, J. L., and Reddy, E. P. (1994) Oncogene 9, 1279-1287
  49. Benevides, J. M., Weiss, M. A., and Thomas, G. J. J. (1994) J. Biol. Chem. 269, 10869-10878
  50. Rodgers, D. W., and Harrison, S. C. (1993) Structure 1, 227-240
  51. Suckow, M., Madan, A., Kisters-Woike, B., von Wilcken-Bergmann, B., and Müller-Hill, B. (1994) Nucleic Acids Res. 22, 2198-2208
  52. Schwabe, J. W. R., Chapman, L., and Rhodes, D. (1995) Structure 3, 201-213
  53. Andrés, V., Chiara, M. D., and Mahdavi, V. (1993) Genes Dev. 8, 245-257

©1997 by The American Society for Biochemistry and Molecular Biology, Inc.