Structural Analysis of Bacillus subtilis SPP1 Phage Helicase Loader Protein G39P*

Scott Bailey‡§, Svetlana E. Sedelnikova‡, Pablo Mesa, Sylvia Ayora||, Jon P. Waltho‡**, Alison E. Ashcroft‡‡, Andrew J. Baron‡‡, Juan C. Alonso, and John B. Rafferty‡§§

From the ‡Krebs Institute for Biomolecular Research, Department of Molecular Biology and Biotechnology, University of Sheffield, Western Bank, Sheffield S10 2TN, United Kingdom, Departamento de Biotecnología Microbiana, Centro Nacional de Biotecnologia, CSIC, Campus Universidad Autonoma de Madrid, Cantoblanco, 28049 Madrid, ||Departamento de Biología Molecular, Universidad Autónoma de Madrid, 28049 Madrid, Spain, and ‡‡Astbury Centre for Structural Molecular Biology, School of Biology, University of Leeds, Leeds LS2 9JT, United Kingdom

Received for publication, September 11, 2002, and in revised form, January 21, 2003

    ABSTRACT

The Bacillus subtilis SPP1 phage-encoded protein G39P is a loader and inhibitor of the phage G40P replicative helicase involved in the initiation of DNA replication. We have carried out a full x-ray crystallographic and preliminary NMR analysis of G39P and functional studies of the protein, including assays for helicase binding by a number of truncated mutant forms, in an effort to improve our understanding of how it interacts with both the helicase and the phage replisome organizer, G38P. Our structural analyses reveal that G39P has a completely unexpected bipartite structure comprising a folded N-terminal domain and an essentially unfolded C-terminal domain. Although G39P has been shown to bind its G40P target with a 6:6 stoichiometry, our crystal structure and other biophysical characterization data reveal that the protein probably exists predominantly as a monomer in solution. The G39P protein is proteolytically sensitive, and our binding assays show that the C-terminal domain is essential for helicase interaction and that removal of just the 14 C-terminal residues abolishes interaction with the helicase in vitro. We propose a number of possible scenarios in which the flexibility of the C-terminal domain of G39P and its proteolytic sensitivity may have important roles for the function of G39P in vivo that are consistent with other data on SPP1 phage DNA replication.

    INTRODUCTION

The initiation of DNA replication is a key step in the life cycle of all cells, and as such its careful and precise control is essential. Studies of prokaryotic systems, centered mainly on Escherichia coli and its extrachromosomal elements, have identified the following three key stages involved in DNA replication initiation: first, the recognition of the DNA replication origin and initial melting of the DNA strands; second, the recruitment of the replication machinery to the origin; and third, the remodeling of a replication complex to trigger the transition from a stable origin-bound complex to a mobile replication machine (1).

Initiation of DNA replication in E. coli proceeds via a sequence of events involving a replicon-specific recognition of oriC by DnaA and the loading of the replicative helicase DnaB by DnaC following the local melting of the DNA in an A + T-rich region (1-3). Initiation of θ-type DNA replication in many different extrachromosomal elements follows a similar central scheme but can differ in the requirements for host-encoded components and their remodeling (see Ref. 3). The best characterized example of initiation of DNA replication within a Gram-positive bacterial environment is that of the Bacillus subtilis phage SPP1. Initiation of SPP1 replication requires the phage-encoded products of genes 38, 39, and 40 (G38P, G39P, and G40P) in addition to the host DNA polymerase III and DnaG primase (4). G38P (a monomer with a predicted molecular mass of 29,997 Da) acts as a close functional equivalent to DnaA (although the proteins share no sequence similarity) and is the replisome organizer of the SPP1 system. The G38P protein specifically interacts with its cognate site present in multiple copies at the phage replication origins (oriL and oriR). This interaction occurs in the absence of ATP and is thought to induce the local unwinding of the adjacent A + T-rich sequence present within oriL to initiate θ-type DNA replication (5-7). G40P is a DnaB-like helicase and as such is a ring-shaped hexamer, capable of unwinding duplex DNA with a 5' to 3' polarity in a reaction fueled by nucleotide 5'-triphosphate hydrolysis (6, 8). G39P is predominantly a monomer (molecular mass 14,610 Da) when free in solution and forms a specific interaction with ATP-activated G40P that inactivates the ssDNA binding, ATPase, and unwinding activities of the helicase. Targeting of G40P by G39P to the G38P-bound oriL then functions to activate G40P upon delivery (6).
It is believed that G40P, in the form of the G39P-G40P-ATP complex, is delivered to G38P-bound oriL via the specific protein-protein interaction of the helicase-bound G39P with the G38P bound at oriL. These interactions result in the formation of an unstable nucleoprotein oriL-G38P·G39P·G40P-ATP intermediate, with subsequent release of G38P/G39P heterodimers that leaves the ATP-activated G40P complex to bind the melted origin region (6). Uncomplexed G38P remains bound to oriL, and the G40P helicase is free to interact with DnaG and the τ subunit of both DNA polymerases and begin DNA unwinding (6, 9, 10). The action of G39P protein is similar to that of the bacteriophage λ gene P helicase-loading protein, but P requires an elaborate remodeling for freeing the helicase from the helicase/helicase-loader complex (1, 3). Furthermore, the action of G39P protein is quite distinct from that of the bacteriophage T4 gene 59 helicase-loading protein whose functions combine those of G38P and G39P as well as having a role in both replication and recombination (11, 12).

As the G39P protein exists in a variety of oligomeric forms including monomers, hetero-oligomers with G40P (probably in a 6:6 ratio of G39P:G40P-ATP), and heterodimers with G38P, the protein must possess a fold capable of allowing it to form a variety of specific interactions depending upon its local environment and the exact stage of the DNA replication process (4, 6). We have carried out a structural investigation of G39P to gain a better understanding of the way in which G39P can interact specifically with both G38P and G40P and thus act as a key component of the system. Here we present the first crystal structure and a preliminary NMR analysis of G39P alongside functional analyses of deletion mutants. It is the first structure of a prokaryotic helicase loader protein involved in θ-type DNA replication that also functions as an inhibitor of helicase function.

    EXPERIMENTAL PROCEDURES

Crystal Structure Determination-- The details of wt G39P and G39P112 variant constructs, their overexpression, purification, and crystallization are described elsewhere. Following purification, the wt protein was crystallized from ammonium phosphate, and preliminary x-ray diffraction data were collected at room temperature on home laboratory x-ray sources. These data suggested that the crystals belong to space group P6₁22 (or P6₅22) with cell dimensions a = b = 105.3 Å, c = 47.4 Å, and diffract to ~3.5 Å. However, the reproducibility of these crystals was very poor, and an analysis by mass spectrometry of freshly purified protein and the crystals themselves revealed multiple fragments derived from proteolytic cleavage at the C terminus of the protein. The proteolysis terminated markedly after the final 14 residues of G39P had been removed, and thus gene 39 was engineered to produce a mutant construct that coded for a truncated protein comprising only the first 112 of 126 residues of G39P. The G39P112 protein and its selenomethionine-incorporated form were prepared in a similar manner to wild type protein but crystallized from ammonium sulfate in space group P2₁2₁2₁ with cell dimensions a = 85.6 Å, b = 89.7 Å, c = 47.6 Å. The crystals have three monomers in the asymmetric unit and an assumed solvent content of 47% based on a VM (Matthews coefficient) of 2.3 Å³ Da⁻¹. The SeMet G39P112 crystals were used in a subsequent multiwavelength anomalous dispersion experiment in which data were collected using a Mar Research 345 imaging plate scanner at the European Synchrotron Radiation Facility on station BM30 using inverse beam geometry to collect Friedel pairs. The data for each wavelength were processed individually and scaled in such a way as to preserve anomalous signal using the HKL Suite of programs (13). The data processing statistics are shown in Table I. The positions of six selenium atoms were found using the program SOLVE (14).
These positions were then refined, and initial phases were calculated in the program MLPHARE (15) following the pseudo-MIR procedure (16). Phasing statistics are shown in Table I.
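
The Matthews coefficient and solvent content quoted above for the P2₁2₁2₁ crystal form can be reproduced with a short calculation. The following is only an illustrative sketch: the molecular mass of the 112-residue construct is not stated in the text, so a round value of 13,000 Da is assumed here (hypothetical, chosen to lie between the 14,610 Da of the full-length protein and the ~12.4 kDa ultracentrifugation estimate), and the function names are invented for the example.

```python
# Sketch of a Matthews coefficient estimate for an orthorhombic cell.
# ASSUMPTION: the molecular mass below is an approximate, hypothetical value;
# the article does not state the exact mass of the G39P112 construct.

def matthews_coefficient(a, b, c, asus_per_cell, mols_per_asu, mol_mass_da):
    """Return V_M in cubic angstroms per dalton for an orthorhombic unit cell."""
    cell_volume = a * b * c  # valid only when all cell angles are 90 degrees
    return cell_volume / (asus_per_cell * mols_per_asu * mol_mass_da)

def solvent_fraction(v_m, protein_constant=1.23):
    """Estimate the solvent fraction using the usual 1.23 / V_M approximation."""
    return 1.0 - protein_constant / v_m

# Cell dimensions from the text; P2(1)2(1)2(1) has 4 asymmetric units per cell.
v_m = matthews_coefficient(85.6, 89.7, 47.6, asus_per_cell=4,
                           mols_per_asu=3, mol_mass_da=13000.0)
solvent = solvent_fraction(v_m)
print(f"V_M = {v_m:.2f} A^3/Da, solvent = {solvent:.0%}")
```

With three monomers per asymmetric unit, this assumed mass gives values close to the 2.3 Å³ Da⁻¹ and 47% reported; other monomer counts would move V_M well outside the range typical of protein crystals.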


                              
Table I
Data collection, phasing, and refinement statistics
The values in parentheses refer to the outer shell.

An electron density map was calculated at 3.0 Å and subsequently improved by solvent flattening and histogram matching with the program DM (17). This map was of good quality with readily identifiable regions of secondary structure. Following a preliminary trace of the secondary structure, non-crystallographic symmetry operators were determined for the three monomers found in the asymmetric unit, and the map was averaged and phase-extended to 2.4 Å using DM. The model fitted to the resultant map was submitted to refinement using the program REFMAC (18). Iterative cycles of phase combination of the partial structure phases and those from the multiwavelength anomalous dispersion experiment, model building, and refinement, which in the latter stages was performed using individual isotropic B-factors, translation, libration, and screw tensor parameterization (19), and loose non-crystallographic symmetry restraints were used to construct a model with good stereochemistry that accounted for residues 1-67 in each subunit. Maps that had not been solvent-flattened nor had non-crystallographic symmetry operators applied were examined to check for possible errors in the assignment of solvent boundary and accidental protein density flattening or for use of inappropriate restraints, but there was no indication that this had happened. The positions of the six selenium atoms correlated with the locations of the methionine residues in the N-terminal portions of each monomer. The refinement statistics are presented in Table I.

NMR Analysis-- Protein samples at concentrations of 1-2 mM in 20 mM phosphate buffer, pH 6.5, and temperatures ranging from 25 to 55 °C were used. Both one-dimensional ¹H and two-dimensional ¹H-¹H experiments were recorded as described previously (20), using a Bruker DRX 500 spectrometer. Data were processed using FELIX (Molecular Simulations Inc.).

Gel Filtration and Analytical Ultracentrifugation Analyses-- The G39P protein used in the gel filtration experiments was prepared and analyzed as described elsewhere (6).

For protein cross-linking, pure G39P (6 µM) was prepared in a phosphate buffer (PO4H2Na/PO4HNa2, pH 7.5, 0.5 mM dithiothreitol, 5% glycerol) containing 50 mM NaCl and then incubated in the presence or absence of glutaraldehyde (0-0.1%) for 30 min at room temperature. The reactions were stopped by addition of stop buffer (50 mM Tris-HCl, pH 7.5, 400 mM glycine, 3% 2-mercaptoethanol, 2% SDS, 10% glycerol) and loaded onto a 15% SDS-PAGE gel.

In the analytical ultracentrifugation sedimentation velocity analysis, 0.42-ml samples of protein at 1 mg ml⁻¹ were centrifuged in 1.20-cm path length, two-sector aluminum centerpiece cells with sapphire windows in a four-place An-60 Ti analytical rotor running in a Beckman Optima XL-I analytical ultracentrifuge at 50,000 rpm at 16 °C. Changes in solute concentration were detected by Rayleigh interference and 280 nm absorbance scans. Results were analyzed by g(s*) analysis (21) using the program DCDT+ version 1.13 (22).
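
For orientation, the mass concentration used in the sedimentation velocity runs can be converted to a molar concentration. This is a minimal sketch that assumes the 14,610 Da monomer mass given for wt G39P in the Introduction; the helper name is hypothetical.

```python
# Convert a protein concentration from mg/ml to micromolar.
# ASSUMPTION: the monomer mass of wt G39P (14,610 Da) is used throughout.

def mg_per_ml_to_micromolar(conc_mg_per_ml, mol_mass_da):
    """mg/ml is numerically g/l; dividing by g/mol gives mol/l."""
    molar = conc_mg_per_ml / mol_mass_da
    return molar * 1e6  # mol/l -> micromol/l

conc_um = mg_per_ml_to_micromolar(1.0, 14610.0)
print(f"{conc_um:.1f} uM monomer")
```

On this assumption a 1 mg ml⁻¹ sample corresponds to a few tens of micromolar monomer, roughly ten times the 6 µM used in the cross-linking reactions.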

Limited Proteolysis Assays-- For proteolytic studies, pure wtG39P or N-terminal His-tagged variants (6 µM) were prepared in phosphate buffer (PO4H2Na/PO4HNa2, pH 7.5, 0.5 mM dithiothreitol, 5% glycerol) containing 50 mM NaCl and 1 mM phenylmethylsulfonyl fluoride and then incubated with proteinase K (62 ng/reaction) for increasing time intervals (0.5, 1, 2, 5, and 10 min) at 37 °C. An aliquot of the mixture was then removed, and the reaction was stopped by addition of stop buffer (50 mM Tris-HCl, pH 7.5, 400 mM glycine, 3% 2-mercaptoethanol, 2% SDS, 10% glycerol), before the products were loaded onto a 15% SDS-PAGE gel. The signal was quantified using a PhosphorImager. The 1-min proteinase K incubation reaction mixture was dialyzed against water and subjected to matrix-assisted laser desorption ionization/time of flight mass spectrometry. The N-terminal His-tagged G39P variant was incubated with proteinase K (62 ng/reaction) for 1 min at 37 °C. The reaction mixture was loaded onto a Ni-NTA column, and the column was washed in phosphate buffer containing 5 mM imidazole before elution with phosphate buffer containing 250 mM imidazole and subsequent analysis of the eluant on a 15% SDS-PAGE gel.

Deletion Mutant Assays-- The B. subtilis SPP1 wt phage was routinely propagated in B. subtilis strain YB886 (supo) and the conditional lethal mutants SPP1sus53 and SPP1sus22 in BG295 (sup3) strain. Phage stocks had titers of 1.0-5.0 × 10¹⁰ plaque-forming units/ml when plated under permissive conditions. Reversion frequencies were not higher than 10⁻⁵. SPP1 wt, SPP1sus53, and SPP1sus22 were used to infect B. subtilis YB886 cells bearing plasmid-borne gene 39 or gene 39-112, and manipulations followed the standard procedures described for SPP1 (23).

For the affinity chromatography assay, gene 39 mutants were constructed that encoded for His-tagged, truncated variants of the wt protein. In separate experiments, each protein variant was loaded onto a Ni-NTA-agarose column (2 µg of protein per 20 µl matrix). G40P (2 µg) was then loaded onto the column, and the binding ability of the G39P variant was confirmed by elution using imidazole (250 mM) followed by SDS-PAGE analysis.

    RESULTS

Crystal Structure of G39P-- The x-ray crystallographic analysis of G39P has revealed a completely unexpected bipartite structure for the protein that is made even more striking given its comparatively small size (126 residues in the wt protein). In the final model fitted to a map at 2.4-Å resolution, residues 1-67 for each of the three copies of the protein in the a.u. were present, and there was a total of 41 solvent molecules. There was no interpretable electron density for residues 68-112 at the C terminus of each subunit, and the N-terminal domain was sufficient to make all the necessary crystal packing contacts. The final model R-factor is 0.20 with a corresponding value for Rfree of 0.23 and strongly supports the proposal that all of the ordered scattering matter has been reasonably modeled at this resolution.
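
The R-factor and Rfree values quoted above are both instances of the standard crystallographic residual, R = Σ||Fobs| - |Fcalc|| / Σ|Fobs|, with Rfree evaluated over a set of reflections excluded from refinement. The sketch below illustrates only that formula; the amplitudes are made-up numbers, not data from this study, and the calculated amplitudes are assumed to be already on the same scale as the observed ones.

```python
# Illustration of the crystallographic residual R.
# ASSUMPTION: f_calc is already scaled to f_obs; the amplitudes used in the
# example are hypothetical and do not come from the G39P data set.

def r_factor(f_obs, f_calc):
    """Return sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Hypothetical working-set amplitudes: residual is (1 + 2 + 0) / 60.
example_r = r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0])
print(f"R = {example_r:.2f}")
```

Applying the same function to the held-out test reflections would give Rfree; a small gap between the two values, as reported here, is generally taken as a sign that the model has not been overfitted.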

Thus the non-complexed G39P protein in vitro would appear to consist of two distinct domains as follows: a fully folded 67-residue N-terminal domain, and a C-terminal domain that has only limited fold (see NMR analysis below). Each of the three copies of the G39P monomer in the a.u. has essentially the same fold for the N-terminal domain that is composed of four helices. There are three α-helices as follows: αA (residues 3-16); αB (residues 26-39); αC (residues 42-55), plus there is a very short 3₁₀ helix D (residues 62-65) comprising little more than one turn. The helices can be described as two approximately parallel pairs (αA/αC and αB/D) that cross at an angle of about 70° (Fig. 1). The structure of the bacteriophage T4 gene 59 helicase-loading protein also has two α-helical domains, but its N-terminal domain shows a strong structural similarity to the high mobility group family proteins (11), which is not seen in the N-terminal domain of G39P.


Fig. 1.   The sequence and crystal structure of G39P. A, the amino acid sequence of G39P with the secondary structure, the extent of the ordered N-terminal domain observed in the crystal structure, and the site of the C-terminal truncation (▲) indicated. B, a portion of the refined protein model fitted to the final 2Fo - Fc map, shown in stereo. C, a stereo Cα-backbone trace of the protein with every 10th residue marked. D, a ribbon trace of a monomer with the secondary structure elements labeled. Figure was produced using the programs ALSCRIPT (35), BOBSCRIPT (36), MOLSCRIPT (37), and Raster3D (38).

The G39P112 protein crystallized with three independent monomers in the a.u., but they adopted an arrangement around a non-crystallographic 6₁ screw axis parallel to the crystallographic c axis (Fig. 2). When viewed along the direction of the c axis, the disordered C-terminal domains are located on the exterior of the helical arrays of monomers formed by the screw axes. The cavities in the crystal lattice are clearly sufficient to accommodate the C-terminal domains of the G39P112 variant as supported by mass spectrometric analysis of the crystal used in the structure determination that confirmed the presence of intact variant (data not shown). However, it would seem that the extra bulk provided by the 14 C-terminal residues of the intact wt protein necessitates an alternative crystal packing arrangement that appears to be less stable as judged by the noticeably poorer crystal reproducibility and diffraction quality. The inter-monomer contacts are made predominantly between residues immediately following helices αA and αC and those preceding helices αB and D. The residues involved are both polar and hydrophobic, and the interface includes two completely buried water molecules. Pairwise superposition of the α-carbon positions of each of the monomers gives root mean square deviation values of 0.2 to 0.3 Å. Approximately 1200 Å² or 30% of the surface area of each monomer is buried in the interfaces with other monomers in the non-crystallographic 6₁ helix and a further ~10% on average in contacts with other subunits in the crystal lattice. The missing polypeptide chain in the electron density map, extending from residue Lys-67, is directed toward a large cavity in the crystalline lattice (Fig. 2) where it adopts a flexible, mainly unfolded state (see NMR analysis below). Thus at least in the crystal lattice, the G39P112 protein appears to exist as a monomer.


Fig. 2.   The arrangement of monomers of G39P112 in the crystal lattice and their electrostatic surface potential. A, ribbon trace of the three monomers in the a.u. labeled 1-3 and shown in red, green, and blue. B, G39P112 monomers shown as ribbon trace and equivalent surface colored for electrostatic potential (red ≤ -10 kcal (mol·e)⁻¹ and blue ≥ 10 kcal (mol·e)⁻¹). View is orthogonal to the non-crystallographic 6₁ screw axis and shows two copies of the a.u. (monomers labeled 1-3 and 1*, 2*, 3*). C, the packing of adjacent helical repeats of G39P112 monomers in the crystal lattice as viewed along the non-crystallographic 6₁ screw axes, with the final residues of the N-terminal domains marked C, indicating the locations of the C-terminal domains. Figure was produced using the programs GRASP (39), MOLSCRIPT (37), and Raster3D (38).

Calculations of the electrostatic surface potential of the folded N-terminal domain of G39P reveal a somewhat negatively charged surface overall. However, there is a notable localized, highly negative patch on the surface formed by residues at the N terminus of helix αA and the N terminus and loop preceding helix αC that also lies adjacent to the last observed residue in the map, Lys-67. The distribution of charge is even more striking when one examines the helical packing of the G39P monomers in the crystal, which reveals the helical array to have a predominantly negatively charged outer surface with the uncharged or positive surface mostly buried in inter-monomer contacts or close to the helical axis (Fig. 2). The unobserved, flexible C-terminal domain may modify the apparent surface charge, but the calculated pI is 4.9 for the C-terminal 59 residues of G39P (the calculated pI for the N-terminal 67 residues is 5.2) and thus might suggest a generally negatively charged surface. Apart from the slight imbalance in positively and negatively charged residues that leads to the acidic pI, the C-terminal domain of G39P does not show a particularly abnormal distribution of residue type.

Molecular replacement attempts to determine the structure of the wt G39P protein using the lower resolution data collected to 3.5 Å from the seemingly related P6₁22/P6₅22 crystal form are ongoing but have so far been unsuccessful.

Analysis of Internal Mobility-- In order to determine to what extent the disorder observed for the C-terminal half of the molecule reflected conformational heterogeneity in solution rather than disorder within the crystal, the ¹H NMR behavior of G39P was investigated. One-dimensional ¹H and two-dimensional ¹H-¹H TOCSY experiments were recorded on samples of both wt G39P and G39P112 mutant protein forms (Fig. 3). Spectra recorded at room temperature for both forms of the protein revealed two domains with very different degrees of motion, as revealed by differential NMR relaxation rates. For the C-terminal ~50 residues, i.e. about half the size of G39P, NMR relaxation rates are slow and are thus dominated by internal mobility far in excess of the overall rotation of the protein. Resonances from these residues show little chemical shift dispersion away from their random coil values, indicative of conformational averaging, and high intensity cross-peaks in TOCSY spectra, as illustrated by the correlations between the aromatic ring protons of Phe-76 and Tyr-80 in Fig. 3B. The intensity of the primary amide cross-peaks of glutamines 84, 90, 104, and 107 and asparagines 99 and 110 is also apparent in Fig. 3B. Backbone amide proton resonances from the mobile C-terminal domain, upon heating the sample, are severely attenuated in intensity by solvent exchange, following saturation of the water resonance (Fig. 3A). This demonstrates that there is weak or no hydrogen bonding involving this part of the protein backbone other than to solvent molecules.


Fig. 3.   NMR analysis of G39P. A, the one-dimensional ¹H NMR spectrum of the G39P112 mutant protein is shown over a series of temperatures. Resonances between 7.6 and 8.4 ppm principally arise from the backbone amide protons of the mobile C-terminal domain. Their reduction in intensity on increasing temperature reveals their lability to exchange with saturated solvent protons. The resonances between 8.4 and 10.2 ppm represent corresponding protons (and the indole NH proton of Trp-33 at 10.1 ppm) from the immobile N-domain. Their increase in intensity with temperature reveals the thermal lability of a self-association process involving this domain. B, the region of a two-dimensional ¹H-¹H TOCSY spectrum (mixing time 30 ms) containing correlations between the protons of the aromatic rings of tyrosine and phenylalanine, illustrating the very high intensity of the cross-peaks of the corresponding residues from the mobile C-terminal domain, namely Phe-76 and Tyr-80. The relatively high intensities of cross-peaks between primary amide protons of residues in the mobile C-terminal domain can also be seen (upper left and lower right).

The resonances from the N-terminal residues, at room temperature, have far lower intensity than would be expected for a protein of around 13-15 kDa, indicative of a self-association process under the conditions of the NMR experiments. Upon heating, resonances from this region sharpen markedly, indicating the thermal dissociation of the aggregate (Fig. 3A). In contrast to the resonances corresponding to the mobile region, these resonances display the chemical shift dispersion of a normally folded domain and include, for example, the easily identifiable indole NH of Trp-33 located at the N terminus of helix αB and found within the hydrophobic core of the crystal structure of the N-terminal domain.

The NMR experiments indicate that in solution the protein behaves as a two-domain entity. One domain, corresponding to approximately the 67 N-terminal residues observed in the electron density map from the x-ray experiment, exists in the fold determined above, although at room temperature it is involved in a thermally labile, self-association process. The second domain, corresponding to the remaining 59 C-terminal residues that are not observed in the electron density map, has rapid internal motion and no well defined and stable fold involving immobilized side chains. A detailed comparison of the spectra from the intact wt and truncated forms of G39P revealed no major difference between the two forms. Our findings strongly support the idea that the disorder observed in the crystal structure for the C-terminal region was neither a result of the truncation of the protein nor merely some form of crystal artifact but reflected an underlying flexibility that may be closely related to the function of the protein.

Analysis of G39P Oligomeric State-- In order to examine further the apparent difference between the oligomeric state of G39P as observed in the crystal and that reported previously in solution (6), the wt protein and the G39P112 mutant were subjected both to gel filtration and analytical ultracentrifugation analyses.

The gel filtration studies suggested that under the conditions tested (40 mM Tris-HCl, pH 8.0, containing 100 mM NaCl at 4 and 25 °C) both the wt G39P and the G39P112 mutant apparently exist largely as a dimer when in the micromolar concentration range (Fig. 4) and as an equilibrium between dimer and monomer when in the nanomolar concentration range (data not shown). However, the NMR data above reveal a rapid equilibrium in solution between dissociated and aggregated states of the protein and emphasize the need for a more cautious interpretation of the results of gel filtration experiments that are necessarily carried out over much longer time scales. The observation of a species approximating to the size of a dimer might actually arise from the rapid interchange between the monomeric and aggregated forms of the protein and is further complicated by an increase in the hydrodynamic radius arising from the flexible C-terminal domain. Gel filtration of G39P samples subjected to protein cross-linking in the micromolar concentration range apparently revealed monomers, dimers, trimers, and higher order oligomers, but interpretation of these results carries the same caveats as described for the non-cross-linked gel filtration analysis with respect to the rapid equilibration between the aggregated states of the protein and its hydrodynamic radius.


Fig. 4.   Gel filtration studies of G39P. Superdex 200 HiLoad gel filtration column chromatograms are shown of wtG39P (A) and G39P112 (B). Experiments were performed in the micromolar protein concentration range. Molecular weight standards are shown, and the positions of their elution times are marked on the chromatograms with the relevant number in a red circle.

In the analytical ultracentrifugation experiments, samples of the wt G39P and G39P112 mutant proteins were subjected to sedimentation velocity measurements under similar solvent conditions to those used in the gel filtration experiment but at both acidic and basic pH values (10 mM BisTris-HCl, pH 6.0, or Tris-HCl, pH 8.0, 160 mM KCl) and a protein concentration in the micromolar range (Fig. 5). The average mass calculated from both absorbance and interference scans of the solute front was ~12.0 kDa for wt G39P and 12.4 kDa for G39P112, which fits well with a monomer form. However, the plots of the rate of change of the concentration (dc/dt) versus the sedimentation coefficient S look slightly irregular for the wt G39P sample at pH 8.0, and a better fit can be made to a mixture of monomer and dimer. It is therefore reasonable to conclude that there is an equilibrium between monomer and higher order oligomer forms that favors the monomeric species under these conditions.


Fig. 5.   Analytical ultracentrifugation of G39P. Plots of rate of change of the concentration (dc/dt) versus sedimentation coefficient S are shown. a, wtG39P at pH 6; b, wtG39P at pH 8; c, G39P112 at pH 6; d, G39P112 at pH 8. The experimentally derived plots are presented (black) as well as the calculated plots based on either a single species model for monomer (dark blue) or dimer (green), or a two species model based on an equilibrium between monomer and dimer (light blue).

Limited Proteolysis Study of G39P-- An investigation of the general susceptibility of G39P to proteolytic degradation was performed using proteinase K and both wtG39P and an N-terminal His-tagged variant. The analysis revealed a much more pronounced sensitivity to proteolytic cleavage in the C-terminal half of the protein that resulted in fragments corresponding to residues 1-79, 1-87, 1-90, 1-94, and 1-106 (Fig. 6). The identity of the fragments was confirmed by the correspondence of the molecular weights determined by mass spectrometry and by retention on a Ni-NTA column of equivalent fragments (as assessed by SDS-PAGE) from the N-terminal His-tagged variant.


Fig. 6.   Limited proteolysis of wtG39P and His-tagged G39P. The results of digestion of G39P with proteinase K are shown. a, SDS-PAGE analysis of the products of digestion over increasing time. Lanes 1-6 correspond to digestion of wtG39P for 0 (lane 1), 0.5 (lane 2), 1 (lane 3), 2 (lane 4), 5 (lane 5), and 10 min (lane 6). Lanes 7 and 9 correspond to digestion of N-terminal His-tagged G39P for 1 and 0 min, respectively. Lane 8 corresponds to eluant from a Ni-NTA column loaded with the products seen in lane 7, i.e. following a 1-min digestion. b, quantification of the results in lane 3 (1-min digestion of G39P). c, matrix-assisted laser desorption ionization/time of flight mass spectrometry analysis of G39P digested for 1 min. The peak of 7,306 Da corresponds to the double charge of full-length G39P. d, the residue numbers for the predicted cleavage sites for bands/peaks a-c are indicated by open arrowheads, and G39P112 variant truncated at residue 112 is indicated by a filled arrowhead. The N-terminal 67 residues seen in the crystal structure are denoted in dark gray.
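
The assignment of the 7,306-Da peak in the mass spectrum to doubly charged full-length G39P can be checked with a simple mass-to-charge calculation. The sketch below assumes positive-ion detection with charge carried by added protons, which the text does not state explicitly, and uses the 14,610 Da monomer mass quoted in the Introduction; the function name is hypothetical.

```python
# Expected m/z of a protonated protein ion.
# ASSUMPTION: ions are [M + zH]z+ species observed in positive-ion mode.

PROTON_MASS_DA = 1.00728  # mass of a proton in daltons

def protonated_mz(neutral_mass_da, charge):
    """Return m/z for a species carrying `charge` additional protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge

mz_double = protonated_mz(14610.0, 2)
print(f"[M+2H]2+ expected near m/z {mz_double:.0f}")
```

On these assumptions the doubly charged ion of the intact protein falls at the position of the peak described in the legend, so that peak need not be interpreted as a proteolytic fragment of roughly half the protein's mass.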

Genetic and Biochemical Analysis of Deletion Mutants-- A series of SPP1 gene 39 mutants was studied in complementation assays, and their expressed protein products corresponding to fragments of G39P were examined in vitro to test their ability to interact with G40P-ATPγS and inhibit its helicase, ATPase, or ssDNA binding activities (6). The results of these studies can be analyzed in the light of the domain structure of G39P revealed by this study.

The gene 39 mutant 39-112, encoding the truncated protein corresponding to residues 1-112, was established in a plasmid-borne system and used to test its ability to complement the defect of the SPP1sus53 conditional lethal mutant. The SPP1sus53 mutant allele has a suppressible mutation at the eighth codon of gene 39. Although a plasmid-borne wt gene 39 fully complements the defect of SPP1sus53, giving a phage yield indistinguishable from the titer of wt SPP1, the plasmid-borne gene 39-112 mutant is unable to do so. This is consistent with the previous isolation of an SPP1 conditional lethal mutant with a suppressible mutation at codon 103 of gene 39 (SPP1sus22) (4) and suggests that G39P112 is inactive as a loader and/or inhibitor of the G40P helicase in vivo.

The full-length wt G39P protein is able to interact with G40P-ATPγS and to inhibit all three associated activities (ssDNA binding, ATPase, and helicase activity (6)). Assays have been performed on fragments of G39P and have shown that the G39P112 variant can neither exert a negative effect on G40P activities nor compete out the wt protein from the G39P-G40P-ATP complex (data not shown). A series of G39P truncated variants with N termini deleted up to residue 73 still show interaction with G40P-ATP as assessed by affinity chromatography assay, but a variant consisting of the N-terminal residues 1-68 does not (Fig. 7).


Fig. 7.   Binding of G40P by truncated His-tagged G39P variants. A, SDS-PAGE analysis of samples eluted from an Ni-NTA column following affinity chromatography (see "Experimental Procedures"). The individual lanes show the results of using different G39P constructs to capture G40P loaded on the column. Lane 1, non-tagged wt G39P. Lanes 2-7, N-terminal His-tagged proteins: lane 2, h39; lane 3, hΔ39N14; lane 4, hΔ39N22; lane 5, hΔ39N55; lane 6, hΔ39N73; lane 7, hΔC69; and lane 8, C-terminal His-tagged protein Δ39C69h. The numbers following the N or C indicate the extent of the truncation from the N and C termini, respectively, and the constructs are shown schematically in B.
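
The logic of mapping an interaction region from truncation data can be sketched as a set calculation: a candidate essential region is one present in every construct that binds G40P and absent from every construct that does not. The example below is illustrative only. It uses the binding outcomes stated in the text, takes the full length as 126 residues (the 67- and 59-residue domains described in the Discussion), and assumes that a deletion "up to residue 73" leaves residues 74-126. The function name and construct labels are hypothetical.

```python
FULL_LENGTH = 126  # assumed: 67-residue N-terminal + 59-residue C-terminal domain

# (label, first residue kept, last residue kept, binds G40P?)
CONSTRUCTS = [
    ("wt G39P", 1, 126, True),
    ("N-terminal deletion up to residue 73", 74, 126, True),  # assumed 74-126
    ("G39P112", 1, 112, False),
    ("N-terminal residues 1-68", 1, 68, False),
]


def candidate_region(constructs, full_length):
    """Bounds of residues kept by all binders and missing from all non-binders."""
    everything = set(range(1, full_length + 1))
    shared_by_binders = set(everything)
    absent_from_nonbinders = set(everything)
    for _label, first, last, binds in constructs:
        present = set(range(first, last + 1))
        if binds:
            shared_by_binders &= present
        else:
            absent_from_nonbinders &= everything - present
    region = shared_by_binders & absent_from_nonbinders
    # Only the outer bounds are reported; the set itself need not be contiguous.
    return (min(region), max(region)) if region else None


print(candidate_region(CONSTRUCTS, FULL_LENGTH))  # (113, 126)
```

With these inputs the calculation points to the 14 C-terminal residues. This simple intersection assumes a single contiguous determinant and cannot exclude contributions from other residues within 74-112.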


    DISCUSSION

Our structural analysis has shown that the G39P protein has two domains: a stably folded 67-residue N-terminal domain and a highly flexible and largely unfolded 59-residue C-terminal domain. This correlates well with our biochemical observations that suggest a bipartite nature for G39P in which the C-terminal domain has been implicated as the fragment of the protein responsible for the interaction with the G40P helicase. The difference in the folding behavior of the two domains of the protein is striking and unexpected and may also indicate some functional significance.

Multifunctional, multidomain proteins are common in biology, but examples as small as the G39P protein are rarer. There is an increasing body of evidence to suggest that "natively unfolded" proteins are quite common in vitro and possibly in vivo (24) and that some can adopt more structured forms only in the presence of partner or target molecules or other ligands (25). Many of these "natively unfolded" proteins have been implicated in disease states such as various forms of cancer, Alzheimer's and Parkinson's diseases, and myotonic dystrophy (24), and their unfolded nature has often been linked to their pathological effects. The presence of ordered domains coupled to other largely unfolded domains as observed for the N- and C-terminal domains in G39P also has a precedent in other structural studies, and in many of these cases a functional significance has been assigned to the flexibility of the domains (26-28). Studies on the Salmonella typhimurium regulatory protein FlgM suggest that this protein is intrinsically unstructured when in dilute solution in vitro (26). However, the C-terminal domain of FlgM is observed to adopt a more structured form either when its partner molecule, the RNA polymerase σ-factor σ28, is added (26), when it is in vivo, or when in vitro conditions are adjusted to match more closely those found in living cells (29). Thus, it is possible that the C-terminal domain of G39P may possess more structure under in vivo conditions, even in the absence of partner molecules, than we have observed in our experiments. However, it might also be the case that the C-terminal domain of G39P is inherently flexible and has little structure when not in a complex to enable the optimal interaction of G39P with its G40P helicase partner.
This interaction has an apparent 6:6 G39P:G40P stoichiometry (6), and the flexibility may be essential for the correct formation of a hexameric arrangement of G39P on the surface of G40P that is still accessible for interaction with origin-bound G38P. Indeed, G39P may need to bind a number of different monomer forms of G40P within the hexamer, as these may vary depending upon either the relative conformations of the helicase subunits or the state of loading of ATP or its hydrolysis products, as observed for the T7 gene 4 helicase (30). Use of a largely unfolded state to bind a variety of targets has been observed previously (31), and the cyclin-dependent kinase inhibitor, p21, has a completely unfolded native state that is suggested to enhance its ability to bind multiple protein targets. Another possible reason for maintaining a flexible C-terminal domain in G39P could be to enable inactivation of the protein by rapid proteolytic degradation (25). We have observed that removal of just the C-terminal 14 residues impairs G40P binding. Within the cell, random unregulated DNA binding and unwinding by G40P-ATP would be deleterious, and hence some control and targeting of its function is required through the combined action of G38P and G39P to ensure loading at the origin of replication. However, once replication has commenced and the replication machinery has moved on from the origin, problems can arise from DNA damage, and the replication fork can stall with release of the replicative machinery and the requirement for subsequent reloading of these components after damage repair. Recently, it has been shown that the loading of G40P at any stalled replication fork by the SPP1 phage-encoded G35P protein can lead to replication fork reactivation (32). At this point, binding of G40P by G39P could be harmful, and thus the levels of G39P might need to be kept low either through its interaction with other factors such as G38P or by its degradation.
Indeed, G39P accumulates very rapidly after phage infection, reaches a plateau at minute 5, remains constant up to minute 18, and returns to initial basal levels after minute 20, whereas levels of G40P accumulate with similar kinetics to those of G39P but remain constant until phage lysis (33). The reason for the apparently abrupt stop in the proteolysis of G39P after the removal of the 14 C-terminal residues under the experimental conditions used prior to crystallization is still under investigation as there is no obvious protease target site at this point. Our proteolytic degradation experiments reveal that the extent of protease sensitivity corresponds well with a more structured N-terminal domain and a less structured C-terminal domain, although the presence of substantial amounts of discrete fragments during the initial stages of proteolysis may argue for some limited structure in the C-terminal domain. Recent studies (34) on the structure of the E. coli protein DnaC that loads the replicative helicase DnaB also suggest that it is unusually flexible when free in solution. This prompts the proposal that a high level of structural flexibility might be a recurring theme in domains of loader proteins involved in θ-type replication that interact with replicative helicases.

The flexibility of the C-terminal domain may also be intrinsic to the ability of the protein to bind G38P and act as a linker in the transfer of the G40P helicase onto its ssDNA target. Upon transfer of the G40P-ATP to the DNA, the G39P dissociates as a heterodimer with G38P (6). The predicted pI of G38P is 9.0, and hence it is likely to have a positive electrostatic surface consistent with its DNA binding function, but this feature might also be important in the interaction with G39P given its overall negative surface charge distribution (see Fig. 2).

Our analyses of the oligomeric state of G39P when free in solution suggest that it is most likely in a monomeric form at the sub-micromolar concentrations found in vivo but that it can form higher order species or aggregates as the local concentration increases. The crystal structure implies that the oligomerization of the monomers probably does not proceed via the initial formation of 2-fold rotationally symmetric dimers but rather the gradual building up of larger species via growing chains of monomers that have 6-fold symmetry potential. Indeed the primary function of the N-terminal domain of G39P may be oligomerization for presentation of the C-terminal domain to partner proteins, but we are currently investigating further possible roles in the interaction with G38P.

Thus, this first crystal structure for a helicase loader/inhibitor protein involved in θ-type DNA replication has revealed an unexpected, highly plastic, bipartite structure that has developed to fulfill multiple interaction functions and to ensure the critical loading of the replicative DNA helicase on the DNA origin of replication.

    ACKNOWLEDGEMENTS

We thank Michel Roth and the station staff on BM30 at the European Synchrotron Radiation Facility and the staff at the Central Laboratory of the Research Councils Daresbury Synchrotron Radiation Source laboratory.

    FOOTNOTES

* This work was supported in part by European Commission Grants BIO4-CT98-0106 and QLK2-CT-2000-00634 and an EMBL grant under the "Human Capital and Mobility" Programme. The Krebs Institute is a designated Biotechnology and Biological Sciences Research Council Biomolecular Sciences Centre and a member of the North of England Structural Biology Centre. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The atomic coordinates and the structure factors (code 1NO1) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).

§ Present address: Dept. of Molecular Biophysics and Biochemistry, Yale University, Bass Center, Rm. 415, 266 Whitney Ave., New Haven, CT 06520-8114.

** Lister Institute Research Fellow.

§§ Royal Society Olga Kennard Fellow. To whom correspondence should be addressed. Tel.: 44-114-2222809; Fax: 44-114-2728697; E-mail: j.rafferty@sheffield.ac.uk.

Published, JBC Papers in Press, February 13, 2003, DOI 10.1074/jbc.M209300200

2 S. Bailey, S. E. Sedelnikova, P. Mesa, S. Ayora, J. C. Alonso, and J. B. Rafferty, submitted for publication.

    ABBREVIATIONS

The abbreviations used are: GXP, gene X product; a.u., asymmetric unit; ss, single-stranded; wt, wild type; BisTris, 2-[bis(2-hydroxyethyl)amino]-2-(hydroxymethyl)propane-1,3-diol; Ni-NTA, nickel-nitrilotriacetic acid; ATPγS, adenosine 5'-O-(thiotriphosphate).

    REFERENCES

1. Baker, T. A., and Bell, S. P. (1998) Cell 92, 295-305
2. Marians, K. J. (1992) Annu. Rev. Biochem. 61, 673-719
3. Lee, D. G., and Bell, S. P. (2000) Curr. Opin. Cell Biol. 12, 280-285
4. Pedré, X., Weise, F., Chai, S., Lüder, G., and Alonso, J. C. (1994) J. Mol. Biol. 236, 1324-1340
5. Missich, R., Weise, F., Chai, S., Pedré, X., Lurz, R., and Alonso, J. C. (1997) J. Mol. Biol. 270, 50-64
6. Ayora, S., Stasiak, A., and Alonso, J. C. (1999) J. Mol. Biol. 288, 71-85
7. Ayora, S., Mesa, P., Weise, F., Stasiak, A., and Alonso, J. C. (2002) Nucleic Acids Res. 30, 2280-2289
8. Bárcena, M., San Martin, C., Weise, F., Ayora, S., Alonso, J. C., and Carazo, J. M. (1998) J. Mol. Biol. 283, 809-819
9. Ayora, S., Langer, U., and Alonso, J. C. (1998) FEBS Lett. 439, 59-62
10. Martinez-Jimenez, M. I., Mesa, P., and Alonso, J. C. (2002) Nucleic Acids Res. 30, 5056-5064
11. Jones, C. E., Mueser, T. C., Dudas, K. C., Kreuzer, K. N., and Nossal, N. G. (2001) Proc. Natl. Acad. Sci. U. S. A. 98, 8312-8318
12. Ismael, F. T., Alley, S. C., and Benkovic, S. J. (2002) J. Biol. Chem. 277, 20555-20562
13. Otwinowski, Z., and Minor, W. (1997) Methods Enzymol. 276, 307-326
14. Terwilliger, T. C., and Berendzen, J. (1999) Acta Crystallogr. Sect. D Biol. Crystallogr. 55, 849-861
15. Otwinowski, Z. (1991) in Proceedings of the CCP4 Study Weekend, Warrington, January 25-26, 1991 (Wolf, W., Evans, P. R., and Leslie, A. G. W., eds), pp. 80-88, SERC Daresbury Laboratory, Warrington, UK
16. Ramakrishnan, V., and Biou, V. (1997) Methods Enzymol. 276, 538-557
17. Cowtan, K. (1994) Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography 31, 34-38
18. Murshudov, G. N., Vagin, A. A., and Dodson, E. J. (1997) Acta Crystallogr. Sect. D Biol. Crystallogr. 53, 240-255
19. Winn, M. D., Isupov, M., and Murshudov, G. N. (2001) Acta Crystallogr. Sect. D Biol. Crystallogr. 57, 122-133
20. Martin, J. R., Jerala, R., Kroonzitko, L., Zerovnik, E., Turk, V., and Waltho, J. P. (1994) Eur. J. Biochem. 225, 1181-1194
21. Stafford, W. F. (1992) Anal. Biochem. 203, 295-301
22. Philo, J. S. (2000) Anal. Biochem. 279, 151-163
23. Chai, S., Szepan, U., Lüder, G., Trautner, T. A., and Alonso, J. C. (1993) Gene (Amst.) 129, 41-49
24. Uversky, V. N. (2002) Eur. J. Biochem. 269, 2-12
25. Dyson, H. J., and Wright, P. E. (2002) Curr. Opin. Struct. Biol. 12, 54-60
26. Daughdrill, G. W., Chadsey, M. S., Karlinsey, J. E., Hughes, K. T., and Dahlquist, F. W. (1997) Nat. Struct. Biol. 4, 285-291
27. Wright, P. E., and Dyson, H. J. (1999) J. Mol. Biol. 293, 321-331
28. Dunker, A. K., Brown, C. J., Lawson, J. D., Iakoucheva, L. M., and Obradovic, Z. (2002) Biochemistry 41, 6573-6582
29. Dedmon, M. M., Patel, C. N., Young, G. B., and Pielak, G. J. (2002) Proc. Natl. Acad. Sci. U. S. A. 99, 12681-12684
30. Singleton, M. R., Sawaya, T. E., Ellenberger, T., and Wigley, D. B. (2000) Cell 101, 589-600
31. Kriwacki, R. W., Hengst, T. L., Reed, S. I., and Wright, P. E. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 11504-11509
32. Ayora, S., Missich, R., Mesa, P., Lurz, R., Yang, X., Egelman, E. H., and Alonso, J. C. (2002) J. Biol. Chem. 277, 35969-35979
33. Weise, F. (1997) Biochemische Charakterisierung der Initiation der DNA-Replikation des Bacillus subtilis-Bacteriophagen SPP1. Ph.D. thesis, Freie Universität, Berlin, Germany
34. Bárcena, M., Ruiz, T., Donate, L. E., Brown, S. E., Dixon, N. E., Radermacher, M., and Carazo, J. M. (2001) EMBO J. 20, 1462-1468
35. Barton, G. L. (1993) Protein Eng. 6, 37-40
36. Esnouf, R. M. (1999) Acta Crystallogr. Sect. D Biol. Crystallogr. 55, 938-940
37. Kraulis, P. J. (1991) J. Appl. Crystallogr. 24, 946-950
38. Merritt, E. A., and Bacon, D. J. (1997) Methods Enzymol. 277, 505-524
39. Nicholls, A., Sharp, K. A., and Honig, B. (1991) Proteins, Structure, Function, and Genetics 11, 281-296
40. Laskowski, R. A., MacArthur, M. W., Moss, D. S., and Thornton, J. M. (1993) J. Appl. Crystallogr. 26, 283-291


Copyright © 2003 by The American Society for Biochemistry and Molecular Biology, Inc.


