(Received for publication, December 13, 1996, and in revised form, February 3, 1997)
From the Institute of Organic Chemistry and the
Institute of Biochemistry, University of Innsbruck,
A-6020 Innsbruck, Austria
Proteins of the cysteine-rich protein (CRP)
family (CRP1, CRP2, and CRP3) are implicated in diverse processes
linked to cellular differentiation and growth control. CRP proteins
contain two LIM domains, each formed by two zinc-binding modules of the
CCHC and CCCC type, respectively. The solution structure of the
carboxyl-terminal LIM domain (LIM2) from recombinant quail CRP2 was
determined by multidimensional homo- and heteronuclear magnetic
resonance spectroscopy. The folding topology retains both independent
zinc binding modules (CCHC and CCCC). Each module consists of two
orthogonally arranged antiparallel β-sheets, and the
carboxyl-terminal CCCC module is terminated by an
α-helix.
15N magnetic relaxation data indicate that the
modules differ in terms of conformational flexibility. They pack
together via a hydrophobic core region. In addition, Arg122
in the CCHC module and Glu155 in the CCCC module are linked
by an intermodular hydrogen bond and/or salt bridge. These residues are
absolutely conserved in the CRP family of LIM proteins, and their
interaction might contribute to the relative orientation of the two
zinc-binding modules in CRP LIM2 domains. The global fold of quail CRP2
LIM2 is very similar to that of the carboxyl-terminal LIM domain of the
related but functionally distinct CRP family member CRP1, analyzed
recently. The carboxyl-terminal CCCC module is structurally related to
the DNA-binding domain of the erythroid transcription factor GATA-1. In
the two zinc-binding modules of quail CRP2 LIM2, flexible loop regions
made up of conserved amino acid residues are located on the same side
of the LIM2 domain and may cooperate in macromolecular recognition.
Tetrahedral zinc-binding domains are important structural elements in a wide variety of proteins, and more than 10 different classes of such Zn(II)-binding motifs have been identified and biochemically characterized, many of them in proteins specifically interacting with nucleic acids (1, 2). The four coordinating ligands in the tetrahedral zinc-binding sites are composed of cysteine sulfur, histidine imidazole nitrogen, or, occasionally, oxygen from a glutamate or aspartate side chain. The LIM motif defines one class of zinc-binding domain and was originally recognized in, and named after, the protein products of the lin-11, isl-1, and mec-3 genes (3, 4). The gene products of lin-11 and mec-3 transcriptionally regulate genes involved in cell fate determination and differentiation in Caenorhabditis elegans, and the isl-1 gene encodes a rat insulin I gene enhancer-binding protein. LIM domains are found in 1-5 copies in many different proteins of diverse functions, either alone or associated with distinct domains of defined function like homeodomains or protein kinase domains (5-7). The LIM motif is basically composed of two zinc finger structures separated by a 2-amino acid spacer and conforms to the consensus sequence CX₂CX₁₆₋₂₃HX₂CX₂CX₂CX₁₆₋₂₁CX₂(C/H/D) (5-7). Spectroscopic studies of LIM domains derived from different LIM proteins revealed that each double finger LIM domain specifically binds two zinc ions (8-11). A distinct family of genes, the CSRP genes, encodes a specific class of LIM proteins, termed cysteine-rich proteins (CRPs) (12). CRP proteins contain 192-194 amino acid residues and exhibit two LIM domains, termed LIM1 (amino-terminal) and LIM2 (carboxyl-terminal). CRP LIM1 and LIM2 domains invariably conform to the 52-amino acid consensus sequence CX₂CX₁₇HX₂CX₂CX₂CX₁₇CX₂C and are separated from each other by 56-59 amino acids (12).
Each CRP LIM motif contains two tetrahedral Zn(II)-coordinating modules, an amino-terminal S₃N₁ site of the CCHC type, and a carboxyl-terminal S₄ site of the CCCC type (8-10).
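The two consensus sequences quoted above can be expressed as regular expressions for scanning a protein sequence. The following is a minimal sketch, assuming one-letter amino acid codes and treating every X as any residue; the function name and the synthetic test sequence are illustrative only and are not taken from this work.

```python
import re

# General LIM consensus from the text: CX2CX16-23HX2CX2CX2CX16-21CX2(C/H/D).
LIM_CONSENSUS = re.compile(
    r"C.{2}C.{16,23}H.{2}C"      # amino-terminal zinc-binding module
    r".{2}"                      # 2-residue spacer between the modules
    r"C.{2}C.{16,21}C.{2}[CHD]"  # carboxyl-terminal zinc-binding module
)

# Stricter 52-residue CRP consensus: CX2CX17HX2CX2CX2CX17CX2C.
CRP_LIM_CONSENSUS = re.compile(
    r"C.{2}C.{17}H.{2}C.{2}C.{2}C.{17}C.{2}C"
)


def find_lim_motifs(sequence, pattern=LIM_CONSENSUS):
    """Return 0-based, half-open (start, end) spans of non-overlapping matches."""
    return [m.span() for m in pattern.finditer(sequence.upper())]


# Synthetic placeholder sequence (NOT a real protein): a 52-residue motif
# with Gly/Ala filler, flanked by two residues on each side.
motif = ("C" + "AA" + "C" + "G" * 17 + "H" + "AA" + "C" + "AA"
         + "C" + "AA" + "C" + "G" * 17 + "C" + "AA" + "C")
synthetic = "MK" + motif + "KE"

spans = find_lim_motifs(synthetic)
crp_spans = find_lim_motifs(synthetic, CRP_LIM_CONSENSUS)
```

A pattern match only indicates that the ligand spacing is compatible with a LIM domain; it says nothing about zinc binding or fold.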
The expression patterns of CSRP genes and the structural properties of their CRP protein products suggest that these genes may have important roles in the regulation of cell differentiation and proliferation. The CSRP1 gene was shown to have properties typical for a primary response gene (13, 14) and its protein product, CRP1, was found to be associated with specific components of the cytoskeleton (15, 16). The CSRP2 gene encoding the CRP2 protein was discovered on the basis of its strong suppression in avian fibroblasts transformed by retroviral oncogenes or chemical carcinogens (17). The suppression of CSRP2 gene expression directly correlates with the transformed phenotype of avian fibroblasts in a conditional transformation system (12) and with the proliferative state of rat arterial smooth muscle cells after mitogenic stimulation (18). The CSRP3 gene was isolated on the basis of its induced expression during rat skeletal muscle differentiation, and its protein product, CRP3 (or MLP for muscle LIM protein), was shown to be a positive regulator of myogenesis (19). In pairwise alignments, the avian homologs of the three members of the CRP family of LIM proteins share 63-76% identical residues in their amino acid sequences and hence represent related but distinct members of this protein family (12). The precise biochemical function of LIM domains in general and of CRP proteins in particular has not been defined yet. The solution structure of the carboxyl-terminal LIM domain of chicken CRP1 was determined by nuclear magnetic resonance spectroscopy, and the protein fold of the tetrathiolate CCCC module was shown to be strikingly similar to that reported for the DNA-interactive CCCC modules within the DNA binding domains of the erythroid transcription factor GATA-1 and of the glucocorticoid receptor (20). Despite this modular structural similarity to DNA-binding proteins, specific interaction of CRP proteins with nucleic acids has not yet been demonstrated. 
On the contrary, it has been inferred from protein affinity assays that CRP LIM domains are involved in specific protein-protein interactions (21-23).
So far, the solution structures of three LIM domains from unrelated LIM proteins have been determined by nuclear magnetic resonance spectroscopy: the carboxyl-terminal LIM domain (LIM2) from chicken CRP1 (20), the single LIM domain from the developmentally regulated rat cysteine-rich intestinal protein (CRIP) (24), and the amino-terminal CCHC Zn(II)-binding module of the single LIM domain from the Lasp-1 protein encoded by a gene that was identified on the basis of its overexpression in human breast carcinoma (25). Here we present the solution structure of the carboxyl-terminal LIM domain (LIM2) from quail CRP2 and assess structural conservation and diversity between closely related but distinct members of the CRP family of LIM domain proteins that apparently fulfill diverse functions in cellular differentiation and growth control.
A polymerase chain reaction was performed using
DNA from the λgt10 clone W15 containing quail CSRP2
(qCSRP2) cDNA (17) as a template and the
oligonucleotides 5′-d(CTAACCATGGACAGGGGAGAG)-3′ (SW001) and
5′-d(CTTATGAGTATTTCTTCCAGGGTA)-3′ (λgt10 reverse sequencing primer)
as 5′ and 3′ primers, respectively. The SW001 primer corresponds to
nucleotides 245-265 of the published qCSRP2 cDNA
sequence (17) with nucleotide substitutions at its 5′ end introducing a
novel NcoI site. The polymerase chain reaction product was
first digested with HindII cutting at a site in the
3′-untranslated region of CSRP2 cDNA and then partially
digested with NcoI to cut at the site generated by primer
SW001 but preserving an internal NcoI site. The
435-nucleotide fragment was ligated into expression plasmid pET3d (26),
which had been cut by BamHI, filled in by Klenow DNA
polymerase, and then digested by NcoI. To rule out polymerase chain reaction-induced mutations and to verify the integrity
of the CSRP2 coding region, the total nucleotide sequence of
the inserted polymerase chain reaction fragment was determined by the
dideoxynucleotide chain termination method using the T7 sequencing kit
(Pharmacia, Vienna, Austria) and pET-specific primers. The expression
plasmid pET3d-qCRP2(LIM2) encodes a 113-amino acid peptide encompassing
amino acids 82-194 of qCRP2 including the carboxyl-terminal LIM domain
(LIM2) (12, 17).
For the
expression of the qCRP2(LIM2) protein in bacteria, pET3d-qCRP2(LIM2)
was transformed into Escherichia coli strain BL21(DE3)pLysS
(26). Bacteria were grown at 37 °C in NZCYM medium containing
ampicillin (100 µg/ml) and chloramphenicol (25 µg/ml) to an optical
density at 600 nm of 0.5. Cells were induced to express qCRP2(LIM2) by
the addition of isopropyl-β-D-thiogalactoside (Boehringer, Vienna, Austria) to a final concentration of 0.5 mM, and incubation was continued for 3 h at 37 °C.
The cells were collected by centrifugation and resuspended in 20 ml of
ice-cold buffer A (50 mM sodium phosphate, pH 6.4, 10 mM NaCl, 0.1% (v/v)
β-mercaptoethanol) per liter of the
original bacterial culture. All subsequent steps were carried out at
4 °C. Bacteria were lysed by a freeze (−80 °C)-thaw cycle, and
the cell lysate was cleared by centrifugation at 23,000 × g for 35 min. The supernatant containing the soluble protein
fraction was loaded onto a CM-52 cation exchanger (Whatman, Maidstone,
United Kingdom) column equilibrated in buffer A. The column (bed
volume, 35 ml) was washed with approximately 50 ml of buffer A until
the eluting solution showed background absorbance at 260 and 280 nm.
Elution of qCRP2(LIM2) was achieved with 30 ml of buffer B (50 mM sodium phosphate, pH 8.0, 10 mM NaCl, 0.1%
(v/v) β-mercaptoethanol). Pooled protein-containing fractions of the
eluate were analyzed by a photometric assay (27) and by
SDS-polyacrylamide gel electrophoresis (15%, w/v) to determine protein
concentration and purity, respectively. The final yield of homogeneous
qCRP2(LIM2) was approximately 12 mg/liter of bacterial culture. The
structural integrity and purity of the protein preparation were verified
by amino-terminal sequencing, and the stoichiometry of bound zinc ions
was analyzed by atomic absorption spectroscopy and electrospray
ionization mass spectrometry.
Concentration of protein solutions for NMR analysis was achieved by dialysis against buffer C (20 mM potassium phosphate, pH 7.2, 50 mM KCl, 0.5 mM dithiothreitol) and centrifugation of the dialyzed solution through Centriprep 10 ultrafiltration filters (Amicon, Witten, Germany). The final protein concentrations of qCRP2(LIM2) solutions used for NMR analysis ranged from 1.2 to 2.2 mM (14.5-26.6 mg/ml).
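The correspondence between the molar and mass concentrations given above follows directly from the calculated Mr of 12,105 reported for the 113-residue peptide. A minimal sketch of the conversion (the function name is illustrative):

```python
# Calculated relative molecular mass of the qCRP2(LIM2) peptide, as reported
# in the text; used here as a molar mass in g/mol.
MR_QCRP2_LIM2 = 12105.0


def millimolar_to_mg_per_ml(conc_mM, molar_mass=MR_QCRP2_LIM2):
    """Convert mM to mg/ml: (mmol/liter) * (g/mol) = mg/liter, then / 1000."""
    return conc_mM * molar_mass / 1000.0


low = round(millimolar_to_mg_per_ml(1.2), 1)   # lower end of the NMR range
high = round(millimolar_to_mg_per_ml(2.2), 1)  # upper end of the NMR range
```

With these inputs the two values round to 14.5 and 26.6 mg/ml, matching the range stated in the text.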
15N labeling of qCRP2(LIM2) was performed by growing
bacteria in minimal medium (4.8 g of Na2HPO4,
3 g of KH2PO4, 0.5 g of NaCl, 1 g of 15NH4Cl in 1 liter of water) supplemented
with 20 ml of an 18% (w/v) glucose solution, 2 ml of 1 M
MgSO4, 4 ml of 10 mM ZnSO4, and ampicillin and chloramphenicol to final concentrations of 100 µg/ml
and 25 µg/ml, respectively. 15NH4Cl (98%
isotope purity) was purchased from CIL (Andover, MA). After reaching an
optical density of 0.5 at 600 nm, cells were induced to express
15N-labeled qCRP2(LIM2) by the addition of
isopropyl-β-D-thiogalactoside to a final concentration of
0.5 mM, and incubation was continued for 5 h at
37 °C. The purification procedure was as described above. The final
yield of purified 15N-labeled qCRP2(LIM2) was approximately
25 mg/liter of bacterial culture.
NMR experiments were performed on a Varian
UNITYPlus 500-MHz spectrometer equipped with a pulse field gradient
unit and triple resonance probes with actively shielded z
gradients. The NMR sample contained 1-2 mM qCRP2(LIM2), 20 mM potassium phosphate, pH 7.2, 50 mM KCl, 0.5 mM dithiothreitol in 90% H2O, 10%
2H2O. NMR spectra were processed and analyzed
using Varian Vnmr and NMRPipe software systems (28). Spectra recorded
for spin system identification and sequential assignment include SS
NOESY (150, 200 ms) (29), TOCSY (45, 75 ms) (30), two-dimensional 15N-filtered and 15N-edited NOESY (150 ms) (31,
32), sensitivity-enhanced two-dimensional 15N HSQC (33),
three-dimensional 15N TOCSY-HSQC, and three-dimensional
15N NOESY-HSQC (34). Spectra were recorded at 26 and
35 °C in order to resolve the residual water signal from some
Hα protons. The NMR properties of the protein did not
change significantly over this temperature range. Signal assignment was
carried out at 26 °C. Two-dimensional homonuclear and heteronuclear
experiments were processed with shifted gaussians in both dimensions.
The TOCSY spectrum resulted in a 512 × 1024 data matrix with 32 scans per t1 value, using a WATERGATE (35)
double echo sequence for water suppression and a DIPSI-2 (36) mixing
sequence. A two-dimensional NOESY spectrum was collected using a
z filter prior to acquisition and a 300-ms selective
excitation pulse (29). The NOESY and the two-dimensional
15N-edited and 15N-filtered NOESY spectra
resulted from a 512 × 1024 data matrix with 32 scans per
t1 value. The one-bond
1H-15N shift correlation (HSQC) spectrum (33)
of qCRP2(LIM2) resulted from a 2 × 64 × 1024 data matrix
size, with 16 scans per t1 value and a delay
time between scans of 1 s. Decoupling (during acquisition) was
achieved with the use of the GARP decoupling sequence (37), using a
1.5-kHz radio frequency field. Shifted squared sine bell windows were
used both in t1 and t2.
The three-dimensional TOCSY-HSQC and three-dimensional NOESY-HSQC
experiments were performed with water flip-back pulse (38) and PFG
sensitivity enhancement (33). The data (64 × 32 × 1024)
were doubled by linear prediction in both indirect dimensions,
processed using 80° shifted squared sine bells, and zero-filled to
256 (t1) and 128 (t2)
points, respectively. DIPSI-2 and NOESY mixing times were set to 50 and
120 ms, respectively. Qualitative 3J(HNHα) scalar coupling information was
obtained from one-dimensional traces through the standard
two-dimensional PFG 15N HMQC experiment (39). Data size was
512 × 1024, with 16 scans per t1 value.
For the measurements of the 15N attenuation factors, two
sets of spectra with and without presaturation of the water signal were
recorded with parameters identical to the PFG sensitivity-enhanced
two-dimensional 15N HSQC experiment (33). The attenuation
factor is given as the ratio of the signal intensities in these two
experiments (40). Dynamic information was obtained by measuring
15N T1 and T2
relaxation, as described by Farrow et al. (41) and analyzed
according to Habazettl and Wagner (42). Experimental and processing
parameters were identical to the PFG sensitivity-enhanced two-dimensional 15N HSQC experiment. Relaxation delays were
0, 45, 90, 135, and 180 ms for the T2 and 0, 150, 300, 450, and 600 ms for the T1 measurements, respectively. Relaxation times were obtained by measuring
peak heights using nonlinear least squares curve fitting (28) with
three adjustable parameters.
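The fitting model is not spelled out in the text. As an illustration only, the sketch below assumes a single-exponential decay with offset, I(t) = a·exp(−t/T) + c, which has three adjustable parameters (a, T, c). It does not reproduce the NMRPipe routine used in this work: for each trial T the linear parameters a and c are solved in closed form, and T is found by a golden-section search on a logarithmic scale. All names and the synthetic data are hypothetical.

```python
import math


def _fit_amplitude_offset(delays, heights, t_relax):
    """For a fixed relaxation time, solve the linear least squares problem
    heights ~ a * exp(-delay / t_relax) + c and return (a, c, sse)."""
    x = [math.exp(-d / t_relax) for d in delays]
    n = len(x)
    sx, sy = sum(x), sum(heights)
    sxx = sum(v * v for v in x)
    sxy = sum(v * h for v, h in zip(x, heights))
    det = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / det
    c = (sy - a * sx) / n
    sse = sum((a * v + c - h) ** 2 for v, h in zip(x, heights))
    return a, c, sse


def fit_relaxation(delays, heights, t_min=1e-3, t_max=10.0, iterations=200):
    """Return (a, T, c) minimizing the squared error; delays and T in seconds.
    The search assumes a single minimum for T between t_min and t_max."""
    golden = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = math.log(t_min), math.log(t_max)

    def sse_at(u):
        return _fit_amplitude_offset(delays, heights, math.exp(u))[2]

    u1 = hi - golden * (hi - lo)
    u2 = lo + golden * (hi - lo)
    f1, f2 = sse_at(u1), sse_at(u2)
    for _ in range(iterations):
        if f1 < f2:
            # Minimum lies in [lo, u2]; reuse u1 as the new upper probe.
            hi, u2, f2 = u2, u1, f1
            u1 = hi - golden * (hi - lo)
            f1 = sse_at(u1)
        else:
            # Minimum lies in [u1, hi]; reuse u2 as the new lower probe.
            lo, u1, f1 = u1, u2, f2
            u2 = lo + golden * (hi - lo)
            f2 = sse_at(u2)
    t_relax = math.exp(0.5 * (lo + hi))
    a, c, _ = _fit_amplitude_offset(delays, heights, t_relax)
    return a, t_relax, c


# Synthetic, noise-free peak heights at the T2 delays listed in the text
# (in seconds); the amplitude, offset, and T2 = 0.1 s are made-up values.
t2_delays = [0.0, 0.045, 0.09, 0.135, 0.18]
synthetic_heights = [100.0 * math.exp(-d / 0.1) + 5.0 for d in t2_delays]
fit_a, fit_t2, fit_c = fit_relaxation(t2_delays, synthetic_heights)
```

With only five delays per series, three free parameters leave two degrees of freedom, so uncertainties on individual relaxation times obtained this way would be expected to be substantial for noisy peaks.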
Three-dimensional structures were
generated using experimentally observed NOE constraints in a simulated
annealing and energy minimization protocol (43) using the program
X-PLOR (44) on SGI Crimson and Indigo2 workstations. In the first stage
of calculation, 247 inter- and intraresidue NOE restraints were applied to a
template structure with randomized φ and ψ
angles and extended side
chains to generate a set of 100 structures. NOE constraints were
classified as strong (1.8-3.0 Å), medium (1.8-4.0 Å), and weak
(1.8-5.0 Å). Initial structure calculations were performed without
the zinc ions. However, the zinc coordination sites were defined by
enforcing tetrahedral geometry of residues Cys120,
Cys123, His141, and Cys144 and of
residues Cys147, Cys150,
Cys168, and Cys171,
respectively. Upper and lower distance limits for the zinc
coordination site were (in Å) 3.30 ≤ Sγ-Sγ ≤ 3.50 and 3.30 ≤ Sγ-Nδ1 ≤ 3.50. Thirty-six structures with minimal
constraint violations were selected. In the next step, upper and lower
bounds were further refined, accounting for the respective maximum and
minimum distances from the 36 structures. During the final refinement,
the number of distance constraints was increased to 393, and 17 dihedral angle constraints were added. The resulting 15 structures with minimal
constraint violations were used for final refinement using a restrained
Powell energy minimization with the CHARMM force field (45). The final refinement also included 19 hydrogen bonding restraints, based on
measured attenuation factors (NH exchange rates), distance restraints
defining the acceptor for hydrogen bonds, and force field parameters
given by Lee et al. (46). In particular, zinc was covalently
attached to Sγ of Cys and to Nδ1 of His.
Zinc-ligand bonds were assigned equilibrium distances of 2.30 Å (Zn-Sγ) and 2.00 Å (Zn-Nδ1) and a force
constant of 200 kcal/(mol·Å2). Angles centered on the
coordinating heteroatoms and the metal atoms were defined as follows
(45): Zn-Sγ-Cβ, 107.94°, 40 kcal/(mol·rad2);
Zn-Nδ1-Cε1, 120°, 40 kcal/(mol·rad2); Sγ-Zn-Sγ, 109°, 40 kcal/(mol·rad2);
Sγ-Zn-Nδ1, 109°, 40 kcal/(mol·rad2). Structure superposition, calculation of
r.m.s. deviation values and visualization were accomplished using
the software package MolMol (47). The coordinates have been deposited
in the Brookhaven Protein Data Bank.
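The restraint classes and zinc-ligand parameters described above can be illustrated with a short sketch. It is not the X-PLOR protocol itself: the flat-bottom violation measure and the harmonic form E = k(x − x0)², written without a factor of 1/2 as in the usual CHARMM convention, are assumptions made here for illustration, and all function names are hypothetical.

```python
# Distance bounds (in Angstrom) for the three NOE classes given in the text.
NOE_BOUNDS = {
    "strong": (1.8, 3.0),
    "medium": (1.8, 4.0),
    "weak": (1.8, 5.0),
}


def noe_violation(distance, noe_class):
    """Return how far (in Angstrom) a distance lies outside its NOE bounds;
    0.0 means the restraint is satisfied."""
    lower, upper = NOE_BOUNDS[noe_class]
    if distance < lower:
        return lower - distance
    if distance > upper:
        return distance - upper
    return 0.0


def harmonic_energy(value, equilibrium, force_constant):
    """Harmonic penalty E = k * (x - x0)**2 (assumed convention, no 1/2)."""
    return force_constant * (value - equilibrium) ** 2


# Example: a Zn-S bond stretched by 0.10 Angstrom from the 2.30 Angstrom
# equilibrium value, with the 200 kcal/(mol*A^2) force constant from the text.
zn_s_energy = harmonic_energy(2.40, 2.30, 200.0)
```

Under these assumptions a 0.10 Å deviation of a zinc-sulfur bond costs about 2 kcal/mol, which indicates how tightly the coordination geometry is held during refinement.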
A constructed
derivative of the pET3d expression vector directed the synthesis of a
113-amino acid peptide with a calculated Mr of
12,105 and an estimated isoelectric point of 9.35 encompassing amino
acids 82-194 of quail CRP2 (qCRP2) including the carboxyl-terminal LIM
domain (LIM2). The highly soluble recombinant protein was purified to
homogeneity in a single step employing CM-52 cation exchange
chromatography. The identity and purity of recombinant qCRP2(LIM2) were
confirmed by amino-terminal amino acid sequencing, which revealed that
a minor portion (<5%) of the protein preparation lacked the
initiating methionine. Atomic absorption spectroscopy showed that
purified recombinant qCRP2(LIM2) contained 1.8 ± 0.1 mol of
zinc/mol of protein. An alignment of the amino acid sequence of the
qCRP2(LIM2) peptide used in this study with the sequence of the
corresponding segment from chicken CRP1 is shown in Fig. 1A. The sequence identity within this region
is 77.5%, while between the native chicken CRP1 and quail CRP2
proteins it is 76.6% (chicken and quail CRP1 proteins are identical,
and chicken and quail CRP2 proteins differ by a single amino acid
substitution) (12). The schematic structure of the quail CRP2 LIM2
domain with the CCHC and CCCC zinc-binding modules is shown in Fig.
1B.
NMR Analyses
The chemical shift dispersion in a
sensitivity-enhanced PFG two-dimensional 15N-1H
HSQC spectrum (Fig. 2) indicates that qCRP2(LIM2) adopts
a well folded structure in aqueous solution; however, only a fraction of all NH signals are visible, presumably due to fast exchange with
bulk water at this pH (7.2) and conformational exchange contributions. A total of 61 secondary backbone amides are visible. Examination of the
spectrum indicated that only one structural form of the protein was
present in solution, since there is one cross-peak for the amide of
each non-proline residue in the protein. Signal assignment followed the
well established strategy (48). In the three-dimensional TOCSY-HSQC,
spectral identification of the spin systems was straightforward. The
sequential assignment was achieved by combining information from
two-dimensional homonuclear NOESY and three-dimensional 15N
NOESY-HSQC experiments. To further disentangle NOESY peaks in the
downfield region of the proton spectra, isotope filter techniques were
applied to selectively monitor NOEs involving either amide or aromatic
protons. Both 15N-filtered (aromatic-aliphatic NOEs, Fig.
3) and 15N-edited (amide-aromatic/aliphatic
NOEs) two-dimensional NOESY spectra were recorded. In some cases, side
chain protons of residues with longer side chains could not be assigned
unambiguously due to signal degeneracies and overlap in the upfield
region of the TOCSY data.
Secondary structure elements were identified primarily on the basis of
cross-peak patterns observed in homonuclear two-dimensional NOESY,
15N-filtered, 15N-edited two-dimensional NOESY
and three-dimensional NOESY-HSQC spectra. Strong sequential
Hα(i − 1)-HN(i)
NOE connectivities were found for residues
(Lys119-Cys120,
Ser126-Tyr128,
Val133-Ala136,
Lys138-His141,
Phe145-Ala148,
Lys152-Leu154,
Thr160-Lys162, and
Glu165-Lys169) that exist in extended
antiparallel β-sheet conformations (β-strands βI-βVIII). Turn
regions could be identified by strong sequential HN(i)-HN(i + 1) and
HN(i)-HN(i + 2) NOEs
(Ser121-Asp125;
Lys149-Gly151). The COOH-terminal α-helix
(αI, Cys171-Lys174) was detected by means of
strong sequential
HN(i)-HN(i + 1) NOEs.
Additional information was extracted from proton chemical shift
analysis, 3J(HNHα), and amide
proton attenuation factors. The Hα secondary shifts
(i.e. shift difference between experimental Hα shift and random coil values) (49, 50) are given in Fig. 4A. There is fairly good agreement between
the secondary shift and the definition of secondary structure elements
from NOE data. In particular, β-strands βII, βIII, βIV, βVII,
and βVIII exhibit diagnostic secondary shifts, in contrast to
β-strands βI, βV, and βVI.
3J(HNHα) scalar coupling
constants taken from traces in the 15N HMQC spectrum
additionally corroborate the secondary structure assignment.
Significant couplings were found for residues Cys123,
Val127, Tyr128, Glu131,
Ile134, Lys138, Trp140,
Cys144, Lys149, Cys150,
Leu154, Thr160, Glu161,
Lys162, Glu165, Ile166, and
Tyr167 and could be correlated with NOESY information
indicating
β-sheet structures. Cys123 is located in a
turn region. Amide proton attenuation factors (i.e. the
retardation of intermolecular exchange of amide protons with bulk
water) were used to monitor hydrogen bonding effects. Fig.
4B shows the measured attenuation factors as a function of residue position. It is evident that there is a significant decrease in
amide proton attenuation factors for residues located in loop regions
of qCRP2(LIM2). This indicates greater solvent exposure and facilitated
solvent accessibility of these protons compared with residues found in
secondary structure elements. Of particular interest are the
significantly higher attenuation factors for amide protons of residues
Cys120, Val127, Val133,
Ile134, Phe145, Lys162,
Glu163, Glu165, Ile166,
Cys171, Tyr172, and Ala173. They
correspond to well defined hydrogen bonds within secondary structure
elements, both β-sheet structures and the carboxyl-terminal
α-helix.
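The secondary shift used above is simply the observed shift minus a residue-specific random coil reference. A minimal sketch follows; the random coil values and observed shifts below are placeholders chosen for illustration, not the reference values (49, 50) or measured data from this work.

```python
def secondary_shifts(observed, random_coil):
    """Return observed minus random coil shifts (ppm), rounded to 0.01 ppm.

    observed    -- list of (residue_type, shift_ppm) tuples
    random_coil -- dict mapping residue_type to its random coil shift (ppm)
    """
    return [round(shift - random_coil[res], 2) for res, shift in observed]


# Placeholder reference values and shifts (hypothetical, for illustration).
placeholder_coil = {"VAL": 4.10, "ALA": 4.30}
placeholder_observed = [("VAL", 4.60), ("ALA", 4.00)]

deltas = secondary_shifts(placeholder_observed, placeholder_coil)
```

Positive (downfield) Hα secondary shifts are generally associated with extended β-strand conformations and negative (upfield) values with helices, which is the sense in which the shifts are called diagnostic.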
15N magnetic relaxation data were interpreted in terms of
the simplified method proposed by Habazettl and Wagner (42) and helped
identify secondary structure elements. 2/T2 − 1/T1 values are sensitive to slow motions on
millisecond to microsecond time scales. Fig. 4C shows the
distribution of 2/T2 − 1/T1 along the backbone of qCRP2(LIM2). There is
a good correlation between backbone 15N dynamics (Fig.
4C), secondary structure, and hydrogen exchange (Fig.
4B). Residues within loop regions exhibit significantly higher 2/T2 − 1/T1
values compared with residues located in secondary structure elements.
Since hydrogen exchange depends on structural isomerization processes
(i.e. the breakage of a blocking hydrogen bond), these
values indicate conformationally flexible sites that transiently form
open, unprotected states in which the exchangeable amide protons are
accessible to the hydrogen exchange catalyst. In structured regions of
the COOH-terminal CCCC module, the 2/T2 − 1/T1 values are quite uniform, the average value
being 8.3 Hz. The CCHC module generally exhibits higher
2/T2 − 1/T1 values
(average of 9.2 Hz), suggesting a less rigid structure for the CCHC
module compared with that of the CCCC module. However, to more
specifically define the motional characteristic of qCRP2(LIM2)
(e.g. fast motion), heteronuclear
1H-15N NOE data and a more elaborate model
including motional anisotropy must be used.
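The relaxation quantity compared between the two modules is a simple combination of the measured relaxation times. The sketch below computes it per residue and averages it over a set of residues; it assumes the combination is the difference 2/T2 − 1/T1 with both times in seconds, and the numerical inputs are hypothetical rather than measured values.

```python
def slow_motion_indicator(t1_s, t2_s):
    """Return 2/T2 - 1/T1 in s^-1 for relaxation times given in seconds."""
    return 2.0 / t2_s - 1.0 / t1_s


def module_average(relaxation_times):
    """Average the indicator over a list of (T1, T2) pairs in seconds."""
    values = [slow_motion_indicator(t1, t2) for t1, t2 in relaxation_times]
    return sum(values) / len(values)


# Hypothetical example: T1 = 0.5 s and T2 = 0.2 s give 10 - 2 = 8 s^-1.
example_value = slow_motion_indicator(0.5, 0.2)
example_mean = module_average([(0.5, 0.2), (0.5, 0.25)])
```

Because exchange broadening shortens T2 but leaves T1 largely unaffected, residues undergoing slow conformational exchange stand out through elevated values of this quantity.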
Structural information in the form of distance constraints was derived
from two-dimensional NOESY and three-dimensional NOESY-HSQC spectra.
From the two-dimensional 15N-filtered NOESY experiment
(Fig. 3) additional distance constraints involving aromatic and
aliphatic protons in the CCHC module could be obtained. Structure
calculations were performed in two consecutive steps (see
"Experimental Procedures"). The final structures have no NOE
violations greater than 1.0 Å (Table I). A
superposition of the backbone coordinates from the final 15 X-PLOR
structures is shown in Fig. 5. The average r.m.s.
deviation from the mean structure for ordered backbone atoms (N,
C, and Cα), including residues 119-173 of the
qCRP2(LIM2) domain is 0.98 ± 0.29 Å (Table I), which is
comparable with a so-called "second generation structure" (51).
Better convergence is obtained when the amino-terminal CCHC module and
the carboxyl-terminal CCCC module are compared independently (residues
119-144, 0.87 ± 0.22 Å; residues 145-173, 0.81 ± 0.17 Å).
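The r.m.s. deviation from the mean structure can be illustrated as follows. This is a sketch only: it assumes the models have already been superimposed (the fitting step performed here by MolMol is not reproduced), it assumes the ± value is a sample standard deviation over the ensemble, and the coordinates in the example are invented.

```python
import math


def mean_structure(models):
    """Average the coordinates atom by atom over pre-superimposed models.
    Each model is a list of (x, y, z) tuples in the same atom order."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(m[i][k] for m in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered atom lists."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsd_from_mean(models):
    """Return (average, sample standard deviation) of the per-model r.m.s.
    deviations from the mean structure."""
    mean = mean_structure(models)
    values = [rmsd(m, mean) for m in models]
    average = sum(values) / len(values)
    if len(values) < 2:
        return average, 0.0
    variance = sum((v - average) ** 2 for v in values) / (len(values) - 1)
    return average, math.sqrt(variance)


# Invented two-model, one-atom ensemble: each model lies 1 Angstrom from
# the mean position, so the average deviation is 1.0 with zero spread.
toy_average, toy_spread = rmsd_from_mean([[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]])
```

Restricting the atom list to one module before superposition, as done in the text for residues 119-144 and 145-173, removes the contribution of relative module orientation from the reported value.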
Tertiary Structure of qCRP2(LIM2) and Relation to Chicken CRP1(LIM2)
A schematic ribbon drawing of the carboxyl-terminal
domain of qCRP2(LIM2) (residues 118-174) is presented in Fig.
6, A and B. The domain starts out
with an anti-parallel β-sheet (residues Lys119-Tyr128), connected via a rubredoxin
type turn ("Rd knuckle") (52), with characteristic hydrogen bonding
between Sγ Cys120 and HN
Arg122 as well as a hydrogen bond between Sγ Cys123 and HN Asp125. It is
followed by a second β-sheet, which is oriented perpendicular to the
first one (Lys132-His141). The linker regions
between the two β-sheets as well as that between the two antiparallel
β-strands βIII and βIV are flexible and appear to have no
noticeable hydrogen bonding. The residues Lys142-Cys144 form a turn and thus complete the
amino-terminal CCHC module. NMR spectral data (low attenuation factor,
high 2/T2 − 1/T1 value) for Asn143 and Cys144 indicate conformational
flexibility for these residues, although Cys144 is
coordinated to the zinc ion. Similar observations were made for the
residues involved in zinc binding within the zinc finger DNA binding
domain of Xfin (53) and some of the ligand binding cysteines of
E. coli Ada (54). Within the CCCC module, residues Phe145-Leu154 comprise a third anti-parallel
β-sheet, again containing a rubredoxin type turn, with a similar
hydrogen bonding pattern between HN Lys149 and
Sγ Cys147 and between HN
Lys152 and Sγ Cys150. Following a
conformationally flexible loop from Glu155 to
Thr157, a final antiparallel β-sheet is formed by
residues Thr160-Cys168. This sheet is the most
regular β-sheet structure in both modules, as can be seen not only by
the NOE connectivities, but also by the chemical shift index and scalar
coupling constants (48). A short α-helix starts out at residue
Gly170, although only reasonably well defined within
residues Cys171-Lys174. Residues 82-118 and
175-194 of qCRP2(LIM2) (cf. Fig. 1) were not visible in the
15N HSQC spectra and thus could not be analyzed.
Fig. 7, A and B, shows
superpositions of backbone atoms of the two independent modules CCHC
(residues 119-144, Fig. 7A) and CCCC (residues 145-173,
Fig. 7B) of qCRP2(LIM2) with corresponding atoms of chicken
CRP1(LIM2) (20). Within each module (i.e. CCHC and CCCC)
there is high structural similarity. The r.m.s. deviation for the CCHC
module is 2.10 Å, and for the CCCC module an r.m.s. deviation of 1.35 Å was calculated. As observed for chicken CRP1(LIM2) (20), in
qCRP2(LIM2) the amino-terminal CCHC and carboxyl-terminal CCCC modules
are packed together via a hydrophobic interface, comprising the side
chains of residues Val133, Ala136,
Trp140, Phe145, Leu154,
Leu159, and Ile166. Residues
Trp140, Phe145, Leu154, and
Ile166 have well defined positions.
A number of interesting side chain/side chain interactions were deduced
from superpositions of the 15 final structures, the most noticeable
being the occurrence of a hydrogen bond and/or salt bridge between
Glu155 and Arg122. In the 15N HSQC
spectra, two arginine HN cross-peaks were observed,
which could be assigned to Arg122 and Arg146.
HNε Arg122 appeared at a remarkably low field
(8.60 ppm), and two separate HNη Arg122
resonances (6.93 and 7.33 ppm, from NOESY data) were observed. It was
noted that hydrogen bonding of the proton HNε in the
guanidinium group of Arg leads to a significant downfield shift in the
1H NMR spectrum (55). The complete lack of additional
interresidue HNε Arg122 NOEs indicated that
the hydrogen bond acceptor was not a backbone carbonyl but a side chain
functional group, most likely a carboxyl group. Further inspection
revealed a spatial proximity of Glu155 and
Arg122 side chain functional groups, and thus it was
concluded that HNε Arg122 and/or
HNη Arg122 were forming a hydrogen bond
and/or salt bridge to the side chain carboxyl group of
Glu155. Significantly, these two residues are absolutely
conserved within the CRP family of LIM proteins (12), suggesting that
they are important determinants for the relative orientation of the two zinc finger modules in the CRP LIM2 domains. In contrast, in the CRIP
LIM domain the corresponding amino acid positions are Lys and Thr, and
the orientation of the two modules is different from that in CRP LIM2
domains (24). This may indicate that not only hydrophobic interactions
in the core region but also salt bridges or hydrogen bonds are
important elements contributing to the global fold of the CRP LIM2
domain. There is also evidence (strong NOEs between Hε2
His141 and Hβ,γ Glu131) for
hydrogen bonding between HNε2 His141 and the
carboxyl group of Glu131. This was also found for chicken
CRP1 (20) and CRIP (24), and it was suggested that this is an important
interaction for defining the conformation of the CCHC module.
Similar to the CCCC modules of chicken CRP1(LIM2) and CRIP (20, 24),
the CCCC module of qCRP2(LIM2) shows striking structural similarities
to the DNA-interacting CCCC modules of the glucocorticoid receptor and
GATA-1 DNA-binding domains (56, 57) and hence may also form a
DNA-contacting structure possibly involved in CRP-nucleic acid
interactions. Conformational flexibility in the Ser156-Thr158 loop segment of the qCRP2(LIM2)
CCCC module is intriguing, given that it is highly conserved in all CRP
proteins and partially conserved between CRP and the DNA-binding GATA-1
and steroid hormone receptor proteins. Furthermore, the loop segment
Ala129-Glu131 connecting the βII and βIII
strands in the CCHC module of qCRP2(LIM2) exhibits conformational
flexibility, and this segment is, again, absolutely conserved between
CRP proteins (12). These two conserved loop segments in the CCHC and
CCCC modules, exhibiting conformational disorder, are located at the
same side of the qCRP2(LIM2) molecule (Fig. 6, A and
B), and it is tempting to suggest that their conformational flexibilities may be relevant for the fine tuning of intermolecular interactions with a putative DNA target and optimization of the binding
interface. Further biochemical and structural analyses of CRP proteins,
including analyses of the amino-terminal LIM1 domain and of putative
functional cooperativity between LIM1 and LIM2, will be important to
aid in the unequivocal identification of the cellular targets for these
proteins and to elucidate the molecular basis for their diverse
physiological functions.
The atomic coordinates and structure factors (code 1QLI) have been deposited in the Protein Data Bank, Brookhaven National Laboratory, Upton, NY.
We thank Friedrich Lottspeich (Max-Planck-Institute of Biochemistry, Martinsried, Germany) for protein sequencing, Klaus Kleboth (Institute of Analytical Chemistry and Radiochemistry, University of Innsbruck) for providing atomic absorption spectroscopy data, Karl-Hans Ongania for mass spectrometry, Gerald Färber for help with the molecular modelling, Georg Kontaxis and Karen Zierler-Gould for helpful discussions, and Sabine Weiskirchen for excellent technical assistance. R. K. thanks Lewis E. Kay (Department of Medical Genetics and Microbiology, University of Toronto, Canada) for providing pulse sequences, software, helpful discussions, and inspiring conversations.