(Received for publication, August 13, 1996, and in revised form, November 15, 1996)
From the Department of Chemistry, Yale University, New Haven, Connecticut 06520-8107
The carboxyl terminus of transcription factor Sp1
contains three contiguous Cys2-His2 zinc finger
domains with the consensus sequence
Cys-X2-4-Cys-X12-His-X3-His.
We have used standard homonuclear two-dimensional NMR techniques to
solve the solution structures of synthetic peptides corresponding to
the last two zinc finger domains (Sp1f2 and Sp1f3,
respectively) of Sp1. Our studies indicate a classical
Cys2-His2 type fold for both the domains
differing from each other primarily in the conformation of
Cys-X2-Cys (β-type I turn) and
Cys-X4-Cys (β-type II turn) elements. There
are, however, no significant differences in the metal binding
properties between the Cys-X4-Cys (Sp1f2)
and Cys-X2-Cys (Sp1f3) subclasses of zinc
fingers.
The free solution structures of Sp1f2 and Sp1f3 are very similar to those of the analogous fingers of Zif268 bound to DNA. There is NMR spectral evidence suggesting that the Arg-Asp buttressing interaction observed in the Zif268·DNA complex is also preserved in unbound Sp1f2 and Sp1f3. Modeling the Sp1-DNA complex by overlaying the Sp1f2 and Sp1f3 structures on Zif268 fingers 1 and 2, respectively, predicts the roles of key amino acid residues, accounts for the interference/protection data, and supports the model of Sp1-DNA interaction proposed earlier.
Synthesis of mRNA by RNA polymerase II requires the interaction of a large array of auxiliary transcription factors that recognize and bind to specific promoter DNA sequences located upstream of eukaryotic genes (1, 2). These transcription factors regulate the initiation of transcription in a temporally ordered manner by assembling and engaging the active transcription complex. Consequently, many of the transcription factors have multiple domains responsible for sequence-specific DNA binding and transcriptional activation.
In order to understand the detailed roles played by each of the domains
of sequence-specific transcription factors, efforts were made to
fractionate the factors necessary to reconstitute transcriptional
activity in vitro (3-5). These experiments resulted in the
identification of one such promoter-specific transcription factor, Sp1,
from HeLa cells (6-10). Sp1 enhances transcription from a variety of
viral and cellular genes by binding to GC-rich decanucleotide
recognition elements (GC boxes) within the 5′-flanking promoter
sequences (10, 11). Although Sp1 can bind and activate transcription
from a single GC box sequence (12), Sp1 binding sites often occur as
multiple repeats (6, 9, 10, 13). However, Sp1 binds independently to
each GC box sequence; physical interaction between adjacent Sp1
molecules is insufficient to give rise to cooperative DNA binding
behavior (13). The multi-domain nature of Sp1 also facilitates Sp1-Sp1
interactions that occur in cases where Sp1 binding sites are widely
separated (14-16). These self-association and DNA looping phenomena
are proposed to give rise to the observed transcriptional synergism or
super-activation of Sp1 (14, 17).
The construction of truncated Sp1 fragments allowed the localization of
the Zn2+-dependent DNA binding region to the
carboxyl terminus, which was shown by sequence analysis to contain
three "zinc finger" domains (7, 18-20). The Zn2+
domains found in Sp1 are analogous to those first identified in TFIIIA
(21) and adopt the consensus sequence
(FYH)XCX2-4CX3FX5LX2HX3HX5 (the metal binding residues are the two Cys and the last two His) (22, 23). These domains are distinct
from the Cys-rich motifs found in the steroid receptors (24), the yeast
transcription factor GAL4 (25), or the Cys2-His-Cys motif
observed in retroviral proteins (26). Structural modeling (27) and
structural studies (28-35) show that Cys2-His2
domains contain two β-strands with the Cys residues located at the
β-turn, and an
α-helix containing the two His residues, oriented to
coordinate Zn2+ in a tetrahedral fashion. This structural
unit is now regarded as one of the major structural motifs involved in
sequence-specific DNA binding and eukaryotic gene regulation.
Our general goal is to understand at the molecular level how the three
zinc finger domains of Sp1 can bind with high affinity to a variety of
GC box DNA sequences (10, 11). We have previously described the
overexpression, purification, and characterization of a 92-amino acid
peptide, Sp1-Zn92, that contains the three zinc fingers of Sp1 (36).
The DNA binding properties of Sp1-Zn92 were surveyed using a variety of
techniques based on gel electrophoresis to quantitatively analyze its
interaction with several native and modified DNA sites. Sp1-Zn92 was
shown to mimic the DNA binding properties of native Sp1 and, through
comparisons with results from other zinc finger systems, a model was
developed to explain the distinctive DNA binding properties of Sp1. Our
model serves as the starting point for detailed studies of the
conformations of the individual domains of Sp1-Zn92 aimed at defining
those molecular features that allow Sp1 to recognize promoter sequences that contain the asymmetric GGGCGG hexanucleotide core (GC box) with a
consensus sequence of 5′-(G/T)GGGCGG(G/A)(G/A)(G/T)-3′.
The distinguishing feature of Sp1-DNA binding is the high degree of sequence variability that is tolerated within the GC box with retention of high binding affinity (10, 11, 37). This raises interesting questions regarding the detailed molecular mechanism of the Sp1-DNA recognition process and the role that the individual fingers may play as "flexible-independent" reading domains in modulating binding to nonidentical DNA sites with nearly equal affinity. As a first step toward studying Sp1-DNA interactions, we have determined the solution structures of synthetic peptides corresponding to zinc finger domains two and three (N terminus to C terminus) of Sp1 using standard homonuclear NMR techniques. While zinc finger 3 belongs to the well defined Cys-X2-Cys structural subclass, zinc finger 2 is a member of the Cys-X4-Cys structural subclass, which has been defined for relatively few systems (32, 33, 35). The refined solution structures of zinc fingers 2 and 3 are compared with each other and with other reported zinc finger structures. Putative DNA binding residues are identified, and the individual roles of conserved residues are analyzed in the context of our previous model (36) of Sp1-DNA interactions.
Peptides corresponding to zinc finger
domains 2 and 3 of Sp1 (Sp1f2 and
Sp1f3, respectively, sequences shown in Fig. 3) were prepared at
the Peptide Synthesis Facility, Keck Foundation Biotechnology Resource
Laboratory, Yale University, using solid phase
N-tert-butyloxycarbonyl chemistry. Amino acids were coupled
by activated esters, and final deprotection/cleavage was done using
hydrogen fluoride. Purification of the peptides was performed using
Vydac C18 reverse phase columns and a linear gradient from 0.05%
trifluoroacetic acid to 80% acetonitrile, 0.05% trifluoroacetic acid.
The final product was lyophilized and characterized by analytical
reverse phase high performance liquid chromatography, amino acid
analysis, and laser desorption mass spectrometry. Peptide samples for
all studies were stored under an argon atmosphere.
Metal Binding Studies
The affinities of peptides Sp1f2 and Sp1f3 for Co2+ ion were determined using absorption spectroscopy (Perkin-Elmer Lambda 6 UV/VIS spectrophotometer) as a function of CoCl2 (99.999%, Aldrich) concentration in 50 mM HEPES, 50 mM NaCl, pH 8.0. All buffers were degassed using several freeze-pump-thaw cycles before use. The Co2+ dissociation constants were calculated using Equation 1 (38):
[Equation 1 is not reproduced in this copy.] (Eq. 1)
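As an illustration only, the sketch below shows one generic way a dissociation constant can be estimated from an optical titration, assuming simple 1:1 Co2+-peptide binding and absorbance values already normalized to the fraction of peptide bound. It is not the expression of Ref. 38; the function names `fraction_bound` and `fit_kd` and the grid-search fit are assumptions introduced for this example.

```python
import math


def fraction_bound(p_total, m_total, kd):
    """Fraction of peptide in the 1:1 complex (exact quadratic solution).

    p_total, m_total and kd are in the same concentration units (e.g. M).
    """
    b = p_total + m_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * p_total * m_total)) / 2.0
    return complex_conc / p_total


def fit_kd(p_total, metal_totals, fractions, lo=1e-9, hi=1e-3, steps=600):
    """Hypothetical helper: log-spaced grid search for the Kd that minimizes
    the summed squared error between model and observed bound fractions."""
    best_kd, best_err = None, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        err = sum(
            (fraction_bound(p_total, m, kd) - f) ** 2
            for m, f in zip(metal_totals, fractions)
        )
        if err < best_err:
            best_kd, best_err = kd, err
    return best_kd
```

In practice the observed fractions would be taken as A/Amax at the visible absorption maximum for each CoCl2 addition; a least-squares optimizer could replace the grid search without changing the model.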
CD spectra of Sp1f2 and Sp1f3 (100 µg/ml peptide, 5 mM Tris·HCl, pH 8.0, 8 °C) were measured using an Aviv model 60DS spectropolarimeter. Spectra were recorded from 300 to 190 nm and averaged over five scans with bandwidth of 1.50 nm, scan step of 1.00 nm/point, and an averaging time of 10.0 s.
NMR Sample Preparation
Sp1f2 and Sp1f3 were dissolved in 0.5 ml of 25 mM Tris-d11 (Cambridge Isotope Laboratories), pH 7.5, containing 0.2% sodium azide (w/v) and 10% D2O (v/v) followed by the addition of ZnSO4 (20% molar excess). The final pH of the sample was adjusted to 5.90 (meter reading, uncorrected for isotope effect). The peptide concentrations were approximately 5 mM for each sample. All solutions were degassed by three freeze-pump-thaw cycles prior to protein dissolution, and all manipulations were carried out under an argon atmosphere. Samples were stored under an argon atmosphere and showed no degradation over the length of NMR experiments.
NMR Methods
NMR spectra were acquired using either a Bruker AM500 or a GE Omega500 NMR spectrometer. Two-dimensional NOESY spectra were acquired with selective water presaturation (delays alternating with nutations for tailored excitation pulse (39), AM500; Shinnar-LeRoux pulse (40), Omega500) followed by the standard NOESY pulse train (41). An inversion pulse bracketed by homospoil pulses was used during the mixing time to minimize artifacts from the residual water resonance. Double quantum filtered correlation spectroscopy spectra were acquired with optimized phase cycling (42). Clean TOCSY spectra (43) were acquired using water saturation as given above, with MLEV17 (44) (AM500, Omega500) or decoupling in the presence of scalar interactions-2 (45) (Omega500) mixing schemes, followed by flipback and homospoil pulses for elimination of the rotating frame Overhauser effect and for water suppression. Quadrature detection in the indirect dimension was obtained using either time-proportional phase incrementation (AM500) or States-time-proportional phase incrementation (Omega500) (46, 47). Spectra were typically acquired with 32 or 48 scans per t1 value for 1024 t1 values, the spectral width was typically 6000 Hz, and 2048 complex points were collected in the direct dimension. The free induction decays in both dimensions were multiplied by a phase-shifted sine bell apodization function, zero-filled, and Fourier-transformed to yield 2048 by 2048 matrices. All spectra were processed using the FELIX 2.30 software package (Biosym, Inc.).
Structure Calculations
The hybrid distance geometry dynamical simulated annealing protocol within X-PLOR software package (48) was used for structure calculations. Interproton distances were calculated from cross-peak volumes derived from two-dimensional NOESY spectra recorded with a 200-ms mixing time, using a NH-NH (i, i + 1) distance of 2.8 Å as an internal standard. Upper and lower bounds were set equal to ±6% of the square of the distance calculated. For non-stereospecific assignments, distance constraints were applied to a pseudo atom situated at the geometric center of the nuclei, and the distance bounds were appropriately expanded, based on the known amino acid geometries. The experimental constraints were represented in the form of an asymmetric internuclear pseudo energy, having a minimum at the distance constraint, an infinite harmonic wall to the lower bound side, and a harmonic function making a transition to a zero slope asymptote to the upper bound side. The soft square potential function used in these calculations had a maximum potential of 50 kcal/mol, a soft square exponent of 2, and a scaling factor of 25.
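The distance calibration described above can be sketched as follows, assuming the usual isolated spin pair approximation in which the NOE cross-peak volume scales as r^-6. The function names are hypothetical, and reading the bounds as d ± 0.06·d² is one interpretation of the text rather than a confirmed detail of the protocol.

```python
def noe_distance(volume, ref_volume, ref_distance=2.8):
    """Interproton distance (Å) from a NOESY cross-peak volume.

    Assumes volume is proportional to r**-6 and calibrates against a
    reference pair (here the NH-NH (i, i + 1) distance of 2.8 Å).
    """
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


def distance_bounds(distance, fraction=0.06):
    """Lower and upper bounds taken as distance -/+ fraction * distance**2.

    This is an assumed reading of "±6% of the square of the distance".
    """
    delta = fraction * distance ** 2
    return distance - delta, distance + delta
```

Because the tolerance grows with the square of the distance, weak (long range) NOEs receive proportionally looser bounds than strong ones under this reading.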
3JHNHα coupling constants for Sp1f2
were determined from a double quantum filtered correlation spectroscopy
spectrum using absorptive and dispersive antiphase splittings (49). The
coupling constants were converted to torsional angle constraints using the Karplus relationship, and these restraints were used during the
refinement step of Sp1f2 structure calculation protocol.
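As a worked illustration of the Karplus conversion, the sketch below evaluates 3J(HN-Hα) = A cos²θ + B cos θ + C with θ = φ − 60°. The coefficients (6.4, −1.4, 1.9 Hz) are one widely used parameterization and are an assumption here, since the text does not state which coefficients or tolerance were applied; the function names are likewise hypothetical.

```python
import math

# Assumed Karplus coefficients in Hz (a common literature parameterization;
# not confirmed as the set used in this work).
A, B, C = 6.4, -1.4, 1.9


def karplus_j(phi_deg):
    """Predicted 3J(HN-Halpha) in Hz for a backbone phi angle in degrees."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def phi_candidates(j_obs, tol=0.5, step=1):
    """All integer phi values (degrees) whose predicted J lies within tol Hz
    of the observed coupling. The curve is multivalued, so several ranges of
    phi can be returned for a single measured J."""
    return [phi for phi in range(-180, 180, step)
            if abs(karplus_j(phi) - j_obs) <= tol]
```

Small couplings (about 4 Hz) map onto helical φ values near −60°, whereas large couplings (above about 9 Hz) map onto extended conformations near −120°, which is why the coupling constants complement the NOE patterns in defining secondary structure.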
The structure calculations for Sp1f2 and Sp1f3 can be
divided into three steps. In the first step, a set of substructures containing the backbone, Cβ, and Cγ atoms
were embedded in Cartesian coordinate space using the distance geometry
protocol. Next, the remaining atoms were added in an extended
conformation and subjected to multiple rounds of simulated annealing.
Finally, the distance geometry simulated annealing regularized
structures were subjected to multiple rounds of simulated annealing
refinement. All calculations were performed using a SGI R4000
workstation.
Initial structure calculations were
performed without incorporating the Zn2+ atom; inspection
of these structures clearly identified the
Cys2-His2 Zn2+ binding ligands.
Furthermore, analysis of the data allowed the unambiguous assignment of
the Nε2 atoms of His residues as the heteroatoms
coordinating the metal. Thereafter, Zn2+ was incorporated
in structure calculations with an approximately tetrahedral geometry.
Zinc-ligand bonds were assigned equilibrium distances of 2.30 Å [Zn-S] and 2.00 Å [Zn-N] (50) using artificial NOE constraints
with a high weighting factor (300). The angles centered on the metal
were constrained to tetrahedral equilibrium values (harmonic potential). The
Nε2 atoms of the His ligands were constrained to lie in the
plane defined by the Cδ2, Cε1, and
Zn2+ atoms.
Final structures were subjected to additional rounds of energy minimization, once after removing any lower bound on distance constraints and once without any explicit metal geometry constraints. Energy minimization without lower bounds had negligible effect on the average geometry of the calculated structures and did not increase the RMSD within the final structures for both Sp1f2 and Sp1f3. The energy minimization without metal binding constraints preserved the configuration and geometry of the Cys and His ligands around the metal binding site showing that metal-ligand constraints were consistent with the global energy minimum for the Sp1f2 and Sp1f3 structures.
The average coordinates and the RMSD values were calculated within
X-PLOR, and the family of structures was visualized and overlaid using
the software package Midas Plus Version 1.9. The structures with
averaged coordinates (Sp1f2·avg, Sp1f3·avg) were subjected to a final round of simulated annealing refinement
(refine.inp protocol of X-PLOR) to relieve bad contacts and irregular
covalent geometry which might have arisen due to geometrical averaging. These average-refined structures (Sp1f2·avg, Sp1f3·avg)
were used to identify potential hydrogen bonds and to model
Sp1f2-DNA and Sp1f3-DNA interactions. The criteria for
hydrogen bonds are that the distance between N of NH and O of CO
(N···O) be less than 3.4 Å and the
angle N-O-C be larger than 110° (51).
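The two geometric criteria can be checked directly from atomic coordinates. The following is a minimal sketch, assuming Cartesian coordinates in Å supplied as (x, y, z) tuples; the function names are hypothetical and are not part of X-PLOR or Midas Plus.

```python
import math


def _sub(a, b):
    """Vector a - b."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1, v2 = _sub(a, vertex), _sub(c, vertex)
    cos_angle = sum(x * y for x, y in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def is_backbone_hbond(n_xyz, o_xyz, c_xyz, max_dist=3.4, min_angle=110.0):
    """True when the N···O distance is below max_dist (Å) and the N-O-C
    angle (measured at the carbonyl O) exceeds min_angle (degrees)."""
    distance = _norm(_sub(n_xyz, o_xyz))
    return distance < max_dist and angle_deg(n_xyz, o_xyz, c_xyz) > min_angle
```

Applying such a test to every member of a structure family gives the fraction of structures consistent with a given hydrogen bond.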
The
substitution of Co2+ for spectroscopically silent
Zn2+ is a well established technique to probe the
coordination environment of zinc containing metalloproteins (52-55).
The visible absorption spectra (Fig. 1) associated with
the d-d transitions of Co2+-substituted Sp1f2 and
Sp1f3 show absorption maxima corresponding to two transitions
centered around 640 nm (Sp1f2, ε = 1080 M−1 cm−1) and 630 nm
(Sp1f3, ε = 1140 M−1 cm−1) with shoulders near 580 nm (Sp1f2, ε = 504 M−1 cm−1)
and 570 nm (Sp1f3, ε = 511 M−1 cm−1). These absorption
bands, which are responsible for the blue color of these
metallopeptides, are eliminated in the presence of stoichiometric
Zn2+, which readily displaces the coordinated
Co2+ ions.
Optical titration experiments allowed the determination of
Co2+ dissociation constants
(KDCo) for Sp1f2
(KDCo Sp1f2 = 1.2 × 10−6) and Sp1f3
(KDCo Sp1f3 = 2.1 × 10−6). These KD values are
consistent with values previously reported for zinc finger domains (62)
and indicate a folded conformation for Sp1f2 and Sp1f3 in
solution.
The positions and intensities of the d-d transitions are consistent
with the formation of 1:1 Co2+-peptide complexes with the
metal centers occupying tetrahedral or distorted tetrahedral
environments for both Sp1f2 and Sp1f3 (54, 56, 57).
The observed ligand field bands are analogous to those reported
for Co2+ binding to the His2-Cys2
site in the second zinc finger domain of TFIIIA (56) and the gene 32 protein (58, 59). However, the definitive identification of
coordination geometry based on UV/VIS spectroscopy is difficult owing
to the similarity of the electronic spectra of distorted tetrahedral
and five-coordinate Co2+ complexes, especially when only
band positions are considered (53, 60, 61).
The circular dichroism spectra of both peptides show negative ellipticities at 228 nm, large negative molar ellipticities at 208 nm, and positive ellipticities at 190 nm (data not shown) in the presence of Zn2+. These features are consistent with the presence of regular secondary structure elements (63) and further indicate that the peptides adopt a conformation typical of folded zinc finger domains.
NMR Sequential Assignments and Secondary Structure Determination
Standard procedures that utilize two-dimensional
COSY, TOCSY, and NOESY NMR data (64) were used to determine sequential resonance assignments (Figs. 2, A and
B). The short and medium range connectivity patterns and the
3JHNHα coupling constants, summarized in Fig.
3, are consistent with an α-helical stretch for
residues 17-28 (Sp1f2) and residues 16-27 (Sp1f3).
Furthermore, the added presence of i, i + 2 connectivities indicates a
310 helical conformation for the last helical turn of both
zinc finger domains. This transition from an α-helix to a
310 helix is a general property shared by the
His-X3-His subclass of zinc fingers (structures
with His-X4-5-His spacing show no
indication of a 310 helix) (29). In both Sp1f2 and
Sp1f3, the helix terminates with the Gly residue two residues
after the second metal binding His. While the COOH-terminal portion of
both peptides produced NOE patterns characteristic of a classical
α-helix, no long uninterrupted connectivity patterns characteristic
of a classical
β-strand were observed in either peptide. However, short segments preceding the first Cys residue and closely succeeding the second Cys residue gave strong
CαHi-NHi+1 connectivities
characteristic of an extended strand conformation. The observed NOE
connectivities (Fig. 3) are also consistent with a turn among residues
Trp7-Cys10 (Sp1f2) and residues
Cys5-Cys8 (Sp1f3) connecting the two
extended strands in each zinc finger domain.
Three-dimensional Structures
The backbone conformations of
Sp1f2 and Sp1f3 families are well defined by the NMR data, as
shown in Fig. 4, A and B, and indicated by the average RMSDs for the peptides (Table
I). As expected, the
NH3+- and COO−-terminal
residues are poorly defined. This is reflected by the low RMSD of 0.43 Å for the backbone from the first conserved hydrophobic residue (Phe)
through the residue immediately succeeding the second metal
coordinating His residue.
The calculated structures exhibit secondary structure elements
consistent with the short range NOE connectivity patterns. The overall
topology of both peptides conforms to the expected fold of
Cys2-His2 zinc finger domains consisting of two
antiparallel strands linked by a Cys-Cys loop followed by a reverse
90° turn and an α-helix containing the two zinc coordinating His
residues.
Observed long range NOEs define the relative orientation of the
secondary structure elements. For Sp1f2 the NOE constraints are
consistent with an antiparallel orientation for the two β-strands extending from the turn encompassing
Trp7-Cys10. Ninety percent of Sp1f2
structures are consistent with hydrogen bonds between
Phe3(CO) and Arg14(NH) (distance
N···O = 2.9 Å, angle N-O-C = 163°) and between Cys5(NH) and Lys12(CO)
(distance
N···O = 3.1 Å, angle
N-O-C = 160°), stabilizing the antiparallel
β-sheet. The
remainder of the amide and carbonyl groups in this sheet region are
oriented as if interacting with the solvent. Long range connectivities
in Sp1f3 also define an antiparallel orientation for its two
β-strands. The antiparallel
β-sheet of Sp1f3, however, seems
to be much more open with no backbone-backbone hydrogen bonds evident
in the average refined structure, as observed previously in the case of
ADR1 and human enhancer proteins (30, 34).
The results of NMR spectroscopy and
Co2+ titration studies show that the increased size of the
Cys-X4-Cys chelate of Sp1f2 does not
dramatically alter either the local geometry or the metal affinity,
relative to the more predominant Cys-X2-Cys
subclass. There are, however, some differences in the turn
conformations of the two zinc finger domains despite the fact that both
turns require the two Cys residues to be positioned for tetrahedral metal binding. The Sp1f2 turn contains the residues
Trp7 (i)-Ser8 (i + 1)-Tyr9 (i + 2)-Cys10 (i + 3), and the first metal coordinating Cys is
completely excluded from the reverse turn (Fig.
5A). The turn is best classified as a
β-type II structural element based on the dihedral angle values observed for the average, refined structure (Table II).
Hydrogen bonds are observed between Trp7(CO) and
Cys10(NH) (distance
N···O = 3.73 Å, angle N-O-C = 109°) in 50% of the structures. The
average refined structure, Sp1f2·avg·min, also shows the
Sγ of Cys10 within H-bonding distance of the
backbone NH of Lys12 (distance = 3.4 Å, angle N-H-S = 124°). The turn conformation is further stabilized by the stacking of
the Trp7 ring against the His27 imidazole ring
(Fig. 6A). The orientation of the Trp indole
ring with respect to His27 is borne out by the dramatic
upfield shift for His27
-protons due to ring current
effects (Fig. 2A).
The Sp1f3 turn includes residues
Cys5 (i)-Pro6 (i + 1)-Glu7 (i + 2)-Cys8 (i + 3) and is best classified as a β-type I
structural element, based on dihedral angle values (65) (Table II). The
Sp1f3 β-turn shows the expected hydrogen bond between CO of
residue i (Cys5) and NH of residue i + 3 (Cys8)
in all calculated structures (distance N···O = 3.2 Å, angle N-O-C = 120°). In addition, the structures place the Sγ of
Cys5 within hydrogen bonding distance of the backbone NH
groups of Glu7 (distance = 3.6 Å, angle N-H-S = 136°) and Cys8 (distance = 3.6 Å, angle N-H-S = 157°), thus further stabilizing the turn conformation (Fig.
5B). These NH···S hydrogen bond geometries are similar
to those observed in ferredoxin and rubredoxin (66). Also, the putative
NH···S hydrogen bonds in Sp1f3 correspond to the
and
bonds in the context of the SPXX motif (51), except that
the Ser residue Oγ atom is replaced by a Cys5 Sγ atom. The
ψ angle for the i + 2 Glu7 in Sp1f3 differs
significantly from the expected value of 0° (Table II) (67), perhaps
due to the fact that Glu7 is involved in a long range
hydrophobic interaction with His23.
While it is not common for NH···S hydrogen bonds to occur in Cys residues involved in disulfide bridges, numerous NH···S bonds, as observed for Sp1f3, are a common feature in proteins coordinating metal ions (66). These bonds are hypothesized to play an important role in stabilizing the ligand arrangement required for metal coordination, thereby minimizing the entropy change caused by metal coordination to the apo protein.
Hydrophobic Core
Packing of the β-sheet and α-helix of
zinc finger domains against each other forms a hydrophobic core and
places the conserved Cys and His residues toward the interior of the
domain in a position to coordinate a Zn2+ ion. The
experimental NOE constraints unambiguously determine the absolute
chirality around the zinc ion as S, following earlier convention (27).
Several residues (Phe14, Lys12,
Trp7, and Lys24 for Sp1f2;
Phe12, Lys10, Glu7, and
Ile22 for Sp1f3) serve to shield the zinc ion from
solvent and may therefore stabilize the metal-ligand interaction by
precluding close approach of alternative donor ligands. The occurrence
of such hydrophobic shells surrounding metal binding sites is well known and believed to play a key role by not only minimizing the change
in conformational entropy upon metal binding by preordering the primary
coordination sphere but also by precluding alternative modes of metal
binding through the reduction of heteroatoms in the vicinity of the
primary coordination sphere.
In addition to the expected packing interactions between aromatic and
other hydrophobic side chains (Figs. 6, A and B
and 7, A and B), alkyl methylene groups of
certain long chain polar residues seem to be involved in hydrophobic
interactions as well. For instance, in Sp1f2, the alkyl chain of
Lys12 packs against the central Phe14 and
His23, whereas the polar amine is oriented toward the
solvent. Similarly, methylene groups of Glu19 and
Lys24 (not visible) pack against Phe14 and
His23, respectively, with their charged groups pointing
outwards (Figs. 6A and 7A). A similar arrangement
is seen for the side chain of Lys10 in Sp1f3 (Figs.
6B and 7B). Thus, despite their relatively small size, the zinc finger domains achieve a relatively high degree of
packing and are stable as mini-globular domains in the presence of zinc
and other divalent metal ions (Fig. 7, A and
B).
Sp1-DNA Interactions
In the Zif268·DNA complex crystal
structure the zinc fingers bind DNA by docking their α-helices in the
major groove such that each zinc finger makes contact with the G-rich
strand of the appropriate cognate 3-base pair subsite using residues at positions
−1, +3, or +6 relative to the start of the
α-helix (32). Fingers 1 and 3 of Zif268 use Arg residues
−1 and +6 to contact the two guanines of the
GCG subsite, and finger 2 uses residues Arg (−1) and His (+3) to contact two guanines of its
subsite. An analysis of the Sp1 sequence, considering the residues analogous to the ones involved in DNA binding
in the Zif268 structure, reveals striking similarities between the zinc
fingers of Sp1 and Zif268 (36). In light of these similarities, we
proposed a model for interaction of Sp1 with DNA based on the
Zif268/DNA co-crystal structure which envisaged Sp1f2, by analogy
to Zif268 finger 1, using Arg (−1) (residue 16) and Arg (+6) (residue
22) for GCG recognition and Sp1f3, by analogy to Zif268 finger 2, using Arg (−1)
(residue 14) and His (+3) (residue 17) for GGG
recognition (Fig. 8) (36, 68).
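The helix-relative numbering used in this model has no position 0: position −1 immediately precedes position +1. A minimal sketch of the mapping between helix positions and peptide residue numbers is given below, anchored on the residue numbers quoted in the text for position −1 (16 in Sp1f2, 14 in Sp1f3); the function name is hypothetical.

```python
def residue_for_position(minus_one_residue, position):
    """Peptide residue number for a helix-relative position.

    minus_one_residue is the residue number at position -1. Positions skip
    zero, so -1 is followed directly by +1.
    """
    if position == 0:
        raise ValueError("helix-relative numbering has no position 0")
    offset = position + 1 if position < 0 else position
    return minus_one_residue + offset
```

With these anchors, positions +3 and +6 map to residues 19 and 22 in Sp1f2 and to residues 17 and 20 in Sp1f3, in agreement with the residue numbers cited for Glu (+3), Arg (+6), His (+3), and Lys (+6).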
To test the validity of the above-mentioned model and to gain further
insight into the role of individual amino acid residues, we
superimposed the average refined structures of Sp1f2 and
Sp1f3 on Zif268 fingers 1 and 2 in the Zif268-DNA crystal
structure (69). As required by the model, the structures of Sp1f2
and Sp1f3 were found to be very similar to Zif268 fingers 1 and
2, respectively (Fig. 9, A and B).
The backbone RMSD for residues 3-25 of Sp1f3 (excluding
the residue immediately succeeding the second metal binding Cys) and
the corresponding 22 residues of Zif268 finger 2 is 0.90 Å. The
backbones deviate significantly at the residue succeeding the second
Cys, due to the difference in φ angles at this position. Sp1f3
contains a Pro residue at this site that has its φ angle restricted
to −60°, whereas most zinc fingers, including Zif268 finger 2, exhibit a positive
φ angle for the residue immediately succeeding the
second Cys location (69). The best overlay of Sp1f2 and Zif268
finger 1 from the first conserved hydrophobic residue to the second
His residue (excluding the four residues in the Cys-Cys loop) gives an
RMSD value of only 0.68 Å. The residues of the
Cys-X4-Cys loop were excluded from the RMSD
calculation since these residues were stated to be ill-defined in the
crystal structure (32).
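For reference, the RMSD quoted for such overlays is the root mean square of the distances between matched atoms. The sketch below assumes the two coordinate sets have already been optimally superimposed and list matching atoms in the same order; it omits the superposition step itself, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å) between
    two equally long lists of (x, y, z) coordinates that are already
    superimposed and listed in matching atom order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Excluding a poorly defined segment, such as the Cys-Cys loop, amounts to leaving those atoms out of both lists before the calculation.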
Overlay of Sp1f2 with Zif268 finger 1 in the co-crystal
structure almost exactly overlaps the backbone atoms of their
respective α-helices, and even though the side chains of residues −1
(Arg16) and +6 (Arg22) of Sp1f2 are not
constrained in the NMR structures, their backbone α-carbons are
positioned such that the side chains start out pointing toward the
major groove of DNA (Fig. 9A) in a manner that is consistent with their proposed interaction with DNA bases, which would be the
guanines of the GCG subsite if we
assume that Sp1f2 docks in the major groove in an orientation
similar to that observed for Zif268 finger 1 in the crystal
structure. This mode of interaction is
also consistent with the protection/interference data (70) and the
complete conservation of the guanines of the GCG subsite
in all Sp1 sites
identified to date (10). In contrast to Arg (−1) and Arg (+6), Glu
(+3) (residue 19) of Sp1f2, whose side chain is relatively well
defined in the NMR structures (average RMSD of side chain carbon
atoms = 1.07 Å), does not point toward the major groove but
instead packs its methylene protons against the central Phe in a
manner similar to the Zif268 Glu (+3) residue (Figs. 6A and
7A), suggesting that it may not interact directly with DNA.
In fact when all the 20 structures of Sp1f2 were overlaid on
Zif268 finger 1, as described before, the Glu (+3) side chain did not
come within interacting distance of any base of DNA in 19 of the 20 models generated. This is consistent with the reported absence of
methylation protection of the middle C position of the 3-base pair
GCG subsite. Gln (+5)
(residue 21) is another hydrophilic residue in the
α-helix that is
relatively well defined in the NMR structures (average RMSD of side
chain carbon atoms = 1.08 Å) and does not point toward the major
groove. This residue, instead, folds back along the backbone (Figs.
6A and 7A) and appears to place its side chain
amide proton within hydrogen bonding distance of Ser17 side
chain Oγ and carbonyl O atoms, although it is difficult
to clearly identify the interacting partner. Thus we see that the model
generated by the overlay and the side chain packing arrangement is
consistent with Arg16 and Arg22 interacting
with DNA bases but tends to preclude the possibility of the other two
polar residues on the
α-helix (Glu19 and
Gln21) being involved in direct base recognition.
A similar overlay of Sp1f3 with Zif268 finger 2 (Fig.
9B) positions Arg (−1) (residue 14) and His (+3)
(residue 17) pointing toward the major groove, consistent with the proposed DNA
contacts. The Sp1f3/DNA model, however, does not preclude the
interaction of Lys (+6) (residue 20) with DNA. The corresponding
residue in Zif268 finger 2 is a Thr which is too short to contact DNA.
Protection/interference data for Sp1 indicate that the guanine base
expected to be contacted by Lys (+6) (the 5′ guanine of GGG) is
only weakly interacting and can be replaced by a thymine residue. The
reason for this apparently weak interaction is not clear and may be
entirely due to the side chain length of Lys being incompatible with
the shorter DNA binding His at +3 position for making simultaneous DNA
contacts (71).
The oxygens of the carboxylate group of
Asp (+2) were found to be in a hydrogen bond-salt bridge interaction
with N group of Arg (−1) in all the three zinc fingers
of Zif268 crystal structure. There is NMR spectral evidence suggesting
that a similar interaction exists between the analogous side chains
Arg16-Asp18 of Sp1f2 and
Arg14-Asp16 of Sp1f3 even in the absence
of DNA (69). In Sp1f3 the NεH proton of
Arg14 gives intense TOCSY cross-peaks with neighboring
protons in the side chain and is shifted downfield to 8.02 ppm from the
random-coil value of 7.20 ppm (Fig. 2B) suggesting that the
Arg14 NεH proton is protected from exchange
and probably involved in a hydrogen bond. Arg14 is also the
only long hydrophilic side chain to show a large chemical shift difference
between the two diastereotopic methylene protons of the terminal
CH2 group (Δppm (CδH2) = 0.50 at 5 °C) (Fig. 2B). This large chemical shift difference indicates a well defined solution conformation for the Arg14
side chain. Furthermore, inspection of the NMR spectra revealed a
moderately strong NOE between the C
H proton of
Arg14 and C
H proton of Asp16
(this NOE was not included in the structure calculations). The above-mentioned facts taken together strongly suggest that the Arg14 NεH proton is involved in a hydrogen
bond-like interaction, in all probability with the side chain of
Asp16, even though this interaction is not apparent in all
the individual structures due to lack of NOE constraints. The
Asp18 of Sp1f2 is also capable of having a similar
interaction with Arg16. Again, the terminal
NεH of Arg16 (Δppm (CδH2) = 0.06 at 15 °C) (Fig.
2A) gives an intense and considerably downfield-shifted
resonance in the NMR spectra, indicating that the Arg-Asp interaction also
exists in Sp1f2 free in solution, unbound to DNA. This Arg-Asp
interaction is presumed to stabilize the long side chain of Arg and
enhance the specificity of arginine-guanine interaction. The presence
of this interaction in Sp1f2 and Sp1f3 further implicates
Arg (
−1) of both domains in DNA binding.
We have also acquired NMR spectra and obtained backbone assignments of an overexpressed peptide fragment containing both zinc finger domains 2 and 3 (Sp1f23). Sp1f23 thus represents two-thirds of the DNA binding domain of Sp1 and has been shown to be capable of binding DNA in a Zn2+-dependent, sequence-specific manner.4 We found that, for the most part, the NMR spectrum of the Sp1f23 construct is close to the sum of the NMR spectra of Sp1f2 and Sp1f3 (except for the residues in the linker between the two domains), indicating negligible domain-domain interactions while free in solution. Since chemical shifts are very sensitive to local structure, this further supports the idea that the single-finger structures remain relevant in the context of larger domains and can serve as reasonable models for understanding the mode of sequence-specific DNA interaction of entire multifinger constructs.
Furthermore, this essentially allows the transfer of Sp1f2 and Sp1f3 assignments to the larger Sp1f23 fragment, thereby greatly facilitating assignment of the entire DNA binding domain of Sp1 in a modular fashion.
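The spectral comparison described above can be illustrated with a minimal sketch. It assumes that proton chemical shifts for the isolated fingers and for the two-finger construct are available as simple residue-to-shift tables, and it flags residues whose shifts differ by more than a chosen cutoff. The function name, the residue labels, the shift values, and the 0.05 ppm cutoff are all hypothetical and are not data from this work.

```python
from typing import Dict, List, Tuple


def perturbed_residues(
    isolated: Dict[str, float],
    tandem: Dict[str, float],
    threshold_ppm: float = 0.05,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, float]]:
    """Return (residue, |shift difference|) pairs above the cutoff.

    Only residues present in both tables are compared. Results are sorted
    from the largest to the smallest difference.
    """
    flagged = []
    for residue, shift in isolated.items():
        if residue not in tandem:
            continue  # no counterpart assignment in the larger construct
        delta = abs(tandem[residue] - shift)
        if delta > threshold_ppm:
            flagged.append((residue, round(delta, 3)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical shifts in ppm, for illustration only.
isolated_shifts = {"res_A": 8.10, "res_B": 7.95, "res_linker": 8.40}
tandem_shifts = {"res_A": 8.11, "res_B": 7.95, "res_linker": 8.62}

flagged = perturbed_residues(isolated_shifts, tandem_shifts)
print(flagged)
```

In a comparison of this kind, residues that are not flagged are those whose single-finger assignments could plausibly be carried over to the larger fragment, while flagged residues (expected mainly in the linker) would need to be reassigned.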
Conclusions
It is our aim to understand the chemical basis of
the unique Sp1 DNA recognition process at the molecular level. Towards
this objective, we have solved the solution structures of synthetic peptides corresponding to zinc finger domains 2 and 3 of Sp1, using
homonuclear two-dimensional NMR spectroscopy. Circular dichroism studies and Co2+ titration experiments show that both
peptides assume a folded conformation around a tetrahedral metal center
with no significant differences in the metal binding affinities between
the Cys-X4-Cys (Sp1f2) and
Cys-X2-Cys (Sp1f3) subclasses. Sp1f2
has a stable β-type I turn between the two strands with the first Cys
residue excluded from the turn motif due to the longer
-X4- loop. Sp1f3 contains the sequence
Cys-Pro-Glu-Cys-Pro in which the first Pro causes the turn to closely
resemble the SPXX motif both in geometry and hydrogen
bonding pattern. The second Pro forces the φ angle at that position
to be fixed at about −60 degrees, contrary to the expected positive
value at this position. This may be the reason for the antiparallel
β-sheet being more open in Sp1f3. The NMR solution structures
show several relatively well defined polar side chains making
hydrophobic contacts with the central apolar residues via their alkyl
methylene groups or aromatic rings while pointing their charged atoms
away toward the solution. Such residues include Lys12 and
His23 of Sp1f2 and Lys10 and
His21 of Sp1f3. It is interesting to note that polar
groups of residues corresponding to these very positions show
interactions with DNA backbone phosphates in the Zif268/DNA
co-crystal structure. Since these side chains seem to have a relatively
fixed orientation even free in solution, these interactions could play
an important role in correctly docking and orienting the zinc fingers
in the major groove of DNA.
The comparison of NMR spectra of Sp1f23 with those of Sp1f2 and Sp1f3 supports the idea that zinc fingers fold as independent, noninteracting entities whose structures remain relevant in the context of the entire protein bound to DNA. The free solution structures of Sp1f2 and Sp1f3 are very similar to those of the analogous zinc finger domains of Zif268 bound to DNA. Modeling the Sp1-DNA complex by overlaying the Sp1f2 and Sp1f3 structures on Zif268 fingers 1 and 2, respectively, predicts the role of key amino acid residues and the interference/protection data and is consistent with the model of Sp1-DNA interaction proposed earlier. Interestingly, the Arg-Asp buttressing interaction observed in the Zif268/DNA crystal structure also seems to be preserved in Sp1f2 and Sp1f3 free in solution. The presence of this interaction in single zinc fingers without DNA further strengthens the emerging theme that zinc fingers are preformed, prearranged motifs, ready to interact with DNA even at the level of individual subdomains. Thus we expect only a very small entropic cost to be associated with sequence-specific recognition of DNA by Sp1 zinc fingers 2 and 3.
In conclusion, the structures of Sp1f2 and Sp1f3 presented above are important steps toward understanding the DNA binding domain of Sp1. These data offer insight into both the structural features of zinc fingers and the mechanisms of sequence-specific interaction of Sp1 with DNA. Work is in progress using both mutagenesis and NMR spectroscopy to further characterize and understand the Sp1-DNA recognition process and define the chemical basis of the unique features of Sp1 binding.
The coordinates of the average refined structures have been deposited in the Brookhaven Protein Data Bank under the file names Sp1f2·avg and Sp1f3·avg.
We thank Prof. James P. Prestegard for useful comments and discussions, Ranajeet Ghose for helpful advice with NMR experiments, and Xiaohong Cao for providing us with purified Sp1f23.