A Conserved Structural Motif at the N Terminus of Bacterial Translation Initiation Factor IF2*

Brian Søgaard Laursen‡§, Kim Kusk Mortensen‡, Hans Uffe Sperling-Petersen‡, and David W. Hoffman∥**

From the ‡Department of Molecular Biology, University of Aarhus, DK8000 Aarhus, Denmark and the ∥Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, University of Texas, Austin, Texas 78712

Received for publication, December 19, 2002, and in revised form, February 4, 2003

    ABSTRACT

The 18-kDa Domain I from the N-terminal region of translation initiation factor IF2 from Escherichia coli was expressed, purified, and structurally characterized using multidimensional NMR methods. Residues 2-50 were found to form a compact subdomain containing three short β-strands and three α-helices, folded to form a βααββα motif with the three helices packed on the same side of a small twisted β-sheet. The hydrophobic amino acids in the core of the subdomain are conserved in a wide range of species, indicating that a similarly structured motif is present at the N terminus of IF2 in many bacteria. External to the compact 50-amino acid subdomain, residues 51-97 are less conserved and do not appear to form a regular structure, whereas residues 98-157 form a helix containing a repetitive sequence of mostly hydrophilic amino acids. Nitrogen-15 relaxation rate measurements provide evidence that the first 50 residues form a well ordered subdomain, whereas other regions of Domain I are significantly more mobile. The compact subdomain at the N terminus of IF2 shows structural homology to the tRNA anticodon stem contact fold domains of the methionyl-tRNA and glutaminyl-tRNA synthetases, and a similar fold is also found in the B5 domain of the phenylalanine-tRNA synthetase. The results of the present work will provide guidance for the design of future experiments directed toward understanding the functional roles of this widely conserved structural domain within IF2.

    INTRODUCTION

The initiation step of protein biosynthesis is rate-limiting and hence an important point of regulation. In bacteria, translation initiation is promoted by three protein factors: IF1, IF2, and IF3. These protein factors are essential in ultimately assembling the 30 S and 50 S subunits of the ribosome, the initiator fMet-tRNAfMet, and the translation initiation region of the mRNA, thereby forming the functional 70 S initiation complex. Bacterial translation initiation factor IF2 ensures correct binding of fMet-tRNAfMet to the P-site of the 30 S ribosomal subunit. Subsequently, the 50 S ribosomal subunit joins the 30 S initiation complex aided by IF2. GTP bound to the factor is hydrolyzed in a ribosome-dependent reaction. IF2 is then released from the ribosome, leaving fMet-tRNAfMet in the ribosomal P-site. Recent reviews of the translation initiation process are provided in Refs. 1-4.

The primary structure of initiation factor IF2 from different organisms can be divided into distinct regions based on interspecies amino acid sequence homology (5), as shown in Fig. 1. The C-terminal region of the protein is highly conserved among species. This part has several functions, including a binding site for fMet-tRNAfMet, a site for binding and hydrolysis of GTP, and a site for interaction with IF1 (Ref. 6 and references therein). Sequence homologues of IF2 have been found in Archaea and eukaryotes, where the factor is referred to as aIF5B and eIF5B, respectively (7). aIF5B and eIF5B have some functional similarity to bacterial IF2; each of these proteins has GTPase activity and promotes ribosomal subunit joining and probably interaction with Met-tRNAiMet (8, 9). The crystal structure of archaeal aIF5B from Methanobacterium thermoautotrophicum has been solved (10); sequence homology predicts a similar structure for the conserved C terminus of IF2 from bacteria. The structure of the 99-residue fMet-tRNAfMet binding domain of IF2 from the bacterium Bacillus stearothermophilus has been solved using NMR methods (11). There is, however, no previously reported crystallographic or NMR structural study of Domain I at the N terminus of bacterial IF2 (Fig. 1), which is the subject of the present work.


Fig. 1.   Schematic diagram of E. coli IF2. The protein encoded by the infB gene is shown with the positions of the alternative initiation sites indicated. Full-length E. coli IF2 is referred to as IF2-1; infB gene products that begin at the alternative initiation sites are referred to as IF2-2 and IF2-3. Domains IV-VI are widely conserved in all three phylogenetic kingdoms, whereas Domains I-III are more variable in primary structure between species. A ribbon diagram of the M. thermoautotrophicum aIF5B structure derived from PDB entry 1G7R is shown (10); the structure of this archaeal protein is homologous to Domains IV, V, and VI of the E. coli IF2. The present study focuses on the region of E. coli IF2-1 that precedes the first alternative initiation site, consisting of the residues up to 157.

Bacterial IF2 is encoded by the infB gene, which in Escherichia coli encodes three forms of IF2: IF2-1, -2, and -3, of molecular masses 97.3, 79.9, and 78.8 kDa, respectively (12). The expression of IF2-2 and IF2-3 in E. coli is by tandem translation of the intact infB mRNA, and not by translation of post-transcriptionally truncated mRNA. Hence, the three different forms of IF2 have identical C termini (13). The presence of both the large and smaller forms is required for optimal growth of E. coli. The cellular content of IF2-2 and -3 is close to the level of IF2-1 (14, 15). The presence of more than one isoform of IF2 is not a phenomenon peculiar to E. coli, but has been found in several other enterobacteria (16).

The N-terminal region of IF2 differs from the C-terminal region in that there is significantly more variability between species in primary structure as well as length. We have previously used sequence data and biochemical experiments to divide the N-terminal region of E. coli IF2 into three separate domains designated Domain I, II, and III (17), as illustrated in Fig. 1. A function for the domains in the N-terminal region has been demonstrated in E. coli, where a fragment of IF2 consisting of Domains I and II, but not a fragment consisting of Domain I alone, binds to the 30 S ribosomal subunit (18, 19). Furthermore, we have recently used a primer extension inhibition assay to identify Domains I-II of E. coli IF2 as an interaction partner for the infB mRNA (16).

The present work describes the results of nuclear magnetic resonance (NMR) and circular dichroism (CD) experiments used to characterize the 18-kDa Domain I of E. coli IF2. A search among structures in the Protein Data Bank revealed that this domain has no significant primary sequence homology to any protein of known structure, and a BLAST search (20) of the non-redundant protein sequence data available at the National Center for Biotechnology Information (NCBI) Web site showed that the domain has no significant sequence homologue other than the same domain within IF2 of different species.

    EXPERIMENTAL PROCEDURES

Protein Cloning, Expression, and Purification-- The fragment of the infB gene encoding the first domain of IF2-1 was amplified by PCR using E. coli K12 as template and primers that included unique restriction sites for XbaI and NdeI for insertion into the pET-15b expression vector (Novagen). DNA sequencing confirmed the insertion of infB into the vector. The protein was expressed in BL21(DE3) cells (Novagen). Cells were grown in M9 minimal medium supplemented with 100 mg/liter ampicillin. Protein expression was induced with 0.1 mM isopropyl-1-thio-β-D-galactopyranoside when the cells reached an OD550 of 0.6. Cells from a 1-liter culture were harvested by centrifugation and resuspended in 20 ml of buffer A (50 mM Hepes, pH 7.6, 10 mM MgCl2, 1 mM dithiothreitol, 0.1 mM phenylmethylsulfonyl fluoride, 15 mM NaN3). The suspension was passed once through a French pressure cell at 1500 psi and centrifuged at 30,000 × g for 1 h. The supernatant was loaded on a 40-ml SP Sepharose FF column (Amersham Biosciences), and bound protein was eluted with a 0-200 mM NaCl step gradient. The buffer was changed to buffer A using a Sephadex G25 column (Amersham Biosciences). The pooled fraction was passed through a Source 30Q column, and the unbound protein was loaded on a 20-ml SP Sepharose HP column (Amersham Biosciences). The IF2 Domain I was eluted with a gradient from 0 to 200 mM NaCl over 8 column volumes, yielding 40 mg of pure protein per liter of culture medium. The purified protein was subjected to N-terminal sequencing by Edman degradation, and the protein mass was determined by MALDI-TOF analysis. Samples of protein enriched in 15N, or in 15N and 13C simultaneously, were prepared as described above, but cells were grown in M9 minimal medium containing 1.5 g/liter [13C]glucose and/or 0.6 g/liter [15N]ammonium chloride (Cambridge Isotope Laboratories) as sources of carbon and nitrogen, respectively.

Circular Dichroism Spectroscopy-- The circular dichroism spectra were recorded on the UV1 photobiology synchrotron beamline at the Institute for Storage Ring Facilities at Aarhus University, Denmark, using synchrotron radiation provided by the ASTRID storage ring. Spectra were recorded in 10 mM phosphate buffer, pH 6.0, using an open 0.01-mm Hellma Suprasil quartz cell. The data were acquired using 5 consecutive scans with 1-nm intervals in the range 180-250 nm. Spectra of each sample were recorded from 5 to 70 °C in 5 °C steps. The sample was allowed to equilibrate at each temperature for 20 min before acquiring the spectra. The data were normalized to a 1 mg/ml concentration in a 1-mm path length cell.

NMR Spectroscopy-- NMR spectra were recorded at 20, 30, and 40 °C using a 500 MHz Varian Inova spectrometer equipped with a triple-resonance probe and z-axis pulsed-field gradient. NMR samples typically contained 2-3 mM of the protein and 10 mM sodium phosphate in 90% H2O/10% D2O or 100% D2O solvent at pH 6.0. Backbone resonance assignments were obtained by analyzing HNCA, HNCO, HNCACB, HN(CO)CACB, and HACACBCO spectra, which correlate the backbone protons to the N, Cα, Cβ, and CO signals of the same and adjacent amino acid residues. 15N-edited HSQC-TOCSY, 13C-edited HCCH-TOCSY, and two-dimensional 2QF-COSY and TOCSY spectra were used for side-chain resonance assignments. NOE cross-peaks were detected using two-dimensional 1H-1H NOESY, three-dimensional 15N-resolved 1H-1H HSQC-NOE, and three-dimensional 13C-edited 1H-1H HSQC-NOESY spectra. The 13C-edited 1H-1H NOE spectrum was collected in 90% H2O/10% D2O solvent, so that NOE peaks between amide and side-chain protons could be resolved by the chemical shift of a side-chain 13C nucleus. Rapidly exchanging amide protons were identified by comparing 15N-1H correlated spectra obtained with selective excitation versus presaturation for solvent suppression. Data were processed using either the program NMRPipe (21) or Felix 1.0 (Hare Research). 1H, 15N, and 13C chemical shifts are referenced as recommended by Ref. 22, with proton chemical shifts referenced to internal 2,2-dimethyl-2-silapentane-5-sulfonate (DSS) at 0 ppm. The 0 ppm 13C and 15N reference frequencies were determined by multiplying the 0 ppm 1H reference frequency by 0.251449530 and 0.101329118, respectively.
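The indirect referencing step described above amounts to a single multiplication per nucleus. The following is a minimal sketch, assuming a nominal 1H 0 ppm frequency of exactly 500 MHz (a hypothetical value; the actual spectrometer frequency is not given in the text). The function names are illustrative only.

```python
# Indirect referencing: derive the 0 ppm 13C and 15N frequencies from the
# 0 ppm 1H (DSS) frequency using the frequency ratios quoted in the text.
XI_13C = 0.251449530  # 13C/1H ratio, as given in the text
XI_15N = 0.101329118  # 15N/1H ratio, as given in the text


def zero_ppm_frequencies(h1_zero_mhz):
    """Return the 0 ppm reference frequencies (MHz) for 13C and 15N."""
    return {"13C": h1_zero_mhz * XI_13C, "15N": h1_zero_mhz * XI_15N}


def ppm(observed_mhz, zero_mhz):
    """Chemical shift in ppm relative to the 0 ppm reference frequency."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


# Hypothetical 1H 0 ppm frequency for a nominal 500 MHz spectrometer.
refs = zero_ppm_frequencies(500.0)
print(f"13C 0 ppm: {refs['13C']:.4f} MHz")
print(f"15N 0 ppm: {refs['15N']:.4f} MHz")
```

With these reference frequencies in hand, any observed 13C or 15N frequency can be converted to a chemical shift with `ppm`.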

Structure Determination-- Structure calculations for IF2 Domain I were performed using the hybrid distance geometry-simulated annealing and energy minimization protocols within the CNS version 1.1 program suite (23). Distance restraints were derived from multidimensional NOE spectra. In order to minimize the effects of spin diffusion, as many of the NOE cross-peaks as possible were identified in homonuclear two-dimensional NOE spectra acquired with relatively short mixing times (60 ms); these spectra also offered the best digital resolution. Peaks from these short mixing time spectra were placed into four categories: strong (<3.2 Å), medium (<3.6 Å), weak (<4.2 Å), and very weak (<4.6 Å). Additional NOE cross-peaks were identified in the three-dimensional 15N- and 13C-edited NOE spectra (60-ms mixing time) and assigned to distance restraints as strong (<5.0 Å), medium (<5.5 Å), weak (<6.3 Å), and very weak (<6.9 Å). A very conservative distance restraint of <7.9 Å was used for NOE cross-peaks identified in spectra obtained with a relatively long mixing time (160 ms), where the effects of spin diffusion are most likely to be present. Pseudoatom corrections were included for NOEs involving stereospecifically unassigned methyl protons of Val or Leu, where distances were measured from the center of the two methyl groups, and 2.5 Å was added to the interproton distance. For NOEs involving other methyl groups, distances were measured from the center of the methyl group, and 1.0 Å was added to the interproton distance. Stereospecifically unassigned methylene groups were treated the same way, and 0.7 Å was added to the interproton distance. For NOEs involving δ and ε protons of Phe, distances were measured from the central point between the two atoms, and 2.4 Å was added to the interproton distance.
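The restraint classification above can be summarized as a lookup table plus an additive pseudoatom correction. The sketch below is illustrative only: the numerical bounds and corrections are those quoted in the text, but the key names and the `upper_bound` function are hypothetical, and it assumes the correction is simply added to the upper bound of the restraint.

```python
# Upper distance bounds (in angstroms) by spectrum type and NOE intensity
# class, taken from the categories listed in the text.
UPPER_BOUNDS = {
    "2d_short_mix": {"strong": 3.2, "medium": 3.6, "weak": 4.2, "very_weak": 4.6},
    "3d_edited": {"strong": 5.0, "medium": 5.5, "weak": 6.3, "very_weak": 6.9},
}
LONG_MIX_BOUND = 7.9  # single conservative bound for the 160-ms spectra

# Pseudoatom corrections (in angstroms) from the text; the keys are
# hypothetical labels chosen for this sketch.
PSEUDOATOM_CORRECTION = {
    None: 0.0,
    "val_leu_methyl_pair": 2.5,  # stereospecifically unassigned Val/Leu methyls
    "methyl": 1.0,               # any other methyl group
    "methylene": 0.7,            # stereospecifically unassigned methylene
    "phe_ring_pair": 2.4,        # delta or epsilon protons of Phe
}


def upper_bound(spectrum, intensity=None, pseudoatom=None):
    """Return the upper distance bound for one NOE restraint."""
    if spectrum == "long_mix":
        base = LONG_MIX_BOUND
    else:
        base = UPPER_BOUNDS[spectrum][intensity]
    return base + PSEUDOATOM_CORRECTION[pseudoatom]


# A strong 2D NOE to an unassigned Leu methyl pair.
print(round(upper_bound("2d_short_mix", "strong", "val_leu_methyl_pair"), 1))
```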
For regions of regular α-helix or β-strand structure identified by characteristic NOE patterns and chemical shift indices (CSI) (24), backbone torsion angle restraints were included for the φ and ψ angles. For β-strands, φ and ψ angles were restricted to -120° ± 25° and 150° ± 25°, respectively, and for α-helices both φ and ψ angles were restricted to -60° ± 25°. Hydrogen bond restraints were only included for amide protons with relatively slow solvent exchange rates that are also located in regions of regular α-helix or β-sheet structure. Twenty diverse starting structures were generated by subjecting a random coil model to the CNS simulated annealing protocol using only the dihedral angle and hydrogen bond constraints. These structures were then used as starting models for 200 runs of the simulated annealing protocol. Most of the simulated annealing runs resulted in similar structures with similar energies. From this final set of refined models, a set of 20 structures was selected that satisfies the following criteria: 1) their CNS energy term is at or very near the minimum value obtained; 2) there are no interproton distance constraint violations of greater than 0.5 Å; and 3) the set of models is a fair representation of the full range of structures that satisfy the NMR-derived restraints while having reasonable molecular geometry, as defined by the CNS energy function. Structural statistics (Table I) were calculated with the assistance of the program PROCHECK-NMR (25).

15N NMR Relaxation Rates-- The 15N T1 and T2 relaxation times and the 15N-1H NOE were measured using pulse sequences (26) that feature gradient selection and sensitivity enhancement, and pulses for minimizing saturation of the solvent water. Six two-dimensional spectra with relaxation delays of 10, 260, 510, 760, 1010, and 1260 ms were acquired for the T1 relaxation measurements, and six two-dimensional spectra with relaxation delays of 29, 58, 87, 116, 145, and 174 ms were acquired for the T2 relaxation measurements; in each case the relaxation delay between the acquisition of each free induction decay was 3 s. The spectra for measuring the 15N-1H NOE were acquired with either a 5-s delay between each free induction decay or a 1-s delay followed by a 4-s long series of 120° nonselective 1H pulses. The T1 and T2 data were fitted to a single exponential decay function of the form $I = I_0 e^{-t/T_d}$, in which $I$ is the intensity of the signal at time $t$, $I_0$ is the intensity at time $t = 0$, and $T_d$ is the decay constant T1 or T2, respectively. Rotational correlation times and order parameters were calculated using Modelfree 4.0 (27) as previously described (28).
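As an illustration of the single exponential fit, the sketch below recovers the decay constant by linear least squares on the logarithm of the intensities. This log-linear approach is an assumption made for simplicity and is not necessarily the fitting procedure used in the original analysis; the intensities are synthetic, noise-free values generated for the T1 delays listed in the text.

```python
import math


def fit_decay(times_s, intensities):
    """Fit I = I0 * exp(-t / Td) by least squares on ln(I) versus t.

    Returns the pair (I0, Td).
    """
    n = len(times_s)
    logs = [math.log(value) for value in intensities]
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    slope = sxy / sxx                      # slope of ln(I) vs t equals -1/Td
    i0 = math.exp(mean_y - slope * mean_t)  # intercept gives ln(I0)
    return i0, -1.0 / slope


# T1 relaxation delays from the text, converted to seconds.
t1_delays = [0.010, 0.260, 0.510, 0.760, 1.010, 1.260]
# Synthetic intensities for a hypothetical residue with T1 = 0.54 s.
synthetic = [100.0 * math.exp(-t / 0.54) for t in t1_delays]

i0, td = fit_decay(t1_delays, synthetic)
print(f"I0 = {i0:.2f}, Td = {td:.3f} s")
```

For real, noisy peak intensities a nonlinear fit with proper weighting would usually be preferable, since taking logarithms distorts the error distribution at long delays.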

    RESULTS

Domain I of E. coli translation initiation factor IF2 was recombinantly expressed, and the purified domain was found to be soluble, stable, and well suited for study using biophysical methods. MALDI-TOF mass spectrometry revealed that the N-terminal methionine residue was post-translationally removed from the protein. CD spectra contain features typical of a protein with substantial α-helical content (Fig. 2A), with characteristic minima at 207 and 222 nm. CD spectra recorded at 30 °C or below look essentially the same, whereas spectra recorded at higher temperatures differ significantly, presumably due to unfolding of the protein (Fig. 2B). The circular dichroism at 207 nm increases with increasing temperature, consistent with a decrease in helical content. The presence of an isodichroic point in the CD spectra indicates that the unfolding is a two-state process. The protein can be reversibly denatured by heating; a sample heated to 70 °C and then cooled to 20 °C has a CD spectrum that is the same as that recorded before heating of the sample. Two-dimensional NMR spectra acquired at 20 and 30 °C are essentially identical and of excellent quality, with the majority of the resonances being well dispersed, as is typical for a folded protein (Fig. 3). However, most NOE cross-peaks were absent in spectra acquired at 40 °C. The NMR and CD data are therefore consistent in indicating that the domain starts to lose structure between 30 and 40 °C.


Fig. 2.   Circular dichroism spectra of IF2 Domain I. A, circular dichroism spectra recorded at temperatures from 5 to 70 °C. 20 °C recovered refers to a spectrum recorded at 20 °C for a sample that has previously been heated to 70 °C. The arrow indicates the isodichroic point. B, The circular dichroism in millidegree at 222 nm for Domain I is plotted against temperature to demonstrate the rate of conformational change with respect to temperature. Note the change in slope of the curve at ~30 °C.


Fig. 3.   15N-1H correlated HSMQC spectrum of Domain I of IF2. The spectrum was acquired at 30 °C in 10 mM phosphate at pH 6. Resonance assignments of the most well resolved cross-peaks are labeled. Chemical shift assignments for the domain of IF2 have been submitted to the BioMagResBank and assigned the accession number BMRB-5624.

An abundance of inter-residue NOE cross-peaks was observed for residues 2-50 of Domain I, consistent with these residues forming a compact globular subdomain. In contrast, residues 51-157 only exhibit NOE cross-peaks between pairs of protons that are relatively close together in the primary sequence; these residues are therefore likely to form a linker region connecting the compact folded structure formed by amino acids 2-50 with the other domains of IF2. The structural details of each of these regions within Domain I of IF2 will be discussed in turn. A summary of local NOE patterns and CSI, which define the secondary structure of the protein, is given in Fig. 4.


Fig. 4.   Schematic representation of NMR and secondary structure data. Circles in the row labeled Cα indicate residues for which the Cα chemical shifts have been unambiguously assigned. The circles are linked in regions where NMR assignments have been made by matching Cα, Cβ, and/or CO chemical shifts obtained in triple-resonance spectra. Sequential NOE connectivities involving the amide and α protons as well as medium range NOE connectivities (dαN(i,i+3)) indicative of helical structure are shown using horizontal bars. Only unambiguous (non-overlapping) NOE information is included. Hα, Cα, and CO CSI are calculated as described in Ref. 24 and correlate well with the identified secondary structural elements. The repetitive amino acid sequence of residues 98-157 prevented complete resonance assignments due to overlap in the spectra; however, those NOEs that are clearly resolved, as well as the CSI values of the overlapping peaks, suggest a helical structure for these 60 residues.

Structure of a Conserved Subdomain (Residues 2-50)-- The structure of residues 2-50 at the N terminus of IF2 was determined from distance constraints derived from observed NOE intensities, and torsion angle and hydrogen bond constraints derived for the regions identified as having regular β-sheet or helical structure. Complete backbone and nearly complete side chain chemical shift assignments were obtained for the 1H, 13C, and 15N nuclei in the subdomain. Of particular significance, complete 1H resonance assignments were obtained for all of the leucine, isoleucine, valine, phenylalanine, and alanine side chains (among others) in the subdomain; assignment of NOE cross-peaks derived from these side chains was critical in defining the hydrophobic core. A superposition of a set of structures that are equally consistent with more than 900 NMR-derived constraints is shown in Fig. 5. These structures are a fair representation of the full range of structures that are consistent with the NMR data. Structural statistics for residues 2-50 are summarized in Table I. Coordinates for the subdomain have been deposited in the Protein Data Bank (PDB), where they have been assigned accession number 1ND9.


Fig. 5.   Diagrams of the N-terminal subdomain (residues 2-50) of IF2 Domain I. A, superposition of the backbones of 20 low energy structures. The ensemble is color-ramped from blue at the N terminus to red at the C terminus of the protein domain. The models are a fair representation of the full range of structures that are consistent with the NMR-derived constraints; each structure is equally consistent with the NMR data. The coordinates for IF2 have been submitted to the Protein Data Bank and have been assigned PDB code 1ND9. B, ribbon diagram. The domain is shown with the twisted β-sheet in front. C, the domain is rotated 90°. The secondary structure elements are color-coded as follows: β1, purple; α1, cyan; α2, dark green; β2, light green; β3, yellow; and α3, red. The figure was made using the program Swiss PDB Viewer (40).


                              
Table I
Summary of refinement and structural statistics for E. coli IF2 res. 2-50
Statistics are derived from a set of 20 low-energy structures, a set that is representative of the range of structures that are consistent with the structural constraints.

The NMR results show that the first 50 amino acids of IF2 form a compact structure consisting of three β-strands and three short α-helices. The three helices are nearly orthogonal, and are located on the same side of an antiparallel twisted sheet formed by three β-strands. Strands β1 (residues 3-6) and β2 (residues 29-32) are linked by helices α1 (residues 8-12) and α2 (residues 16-26), strand β3 (residues 35-39) is connected to β2 by a short loop, and the compact subdomain terminates with helix α3 (residues 42-50). Alignments of IF2 sequences from different species show that hydrophobic residues Ile-6, Leu-9, Val-17, Leu-20, Val-21, Phe-24, Ala-27, Ile-29, Val-37, Leu-45, Ile-46, and Leu-49 are conserved in a wide range of species (Figs. 6 and 7); these residues are all buried and form the core of the subdomain structure. A ribbon diagram depicting the fold of the subdomain is shown in Fig. 5.
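To make the relationship between the conserved core residues and the secondary structure elements explicit, the short sketch below assigns each residue listed above to the element whose range contains it. The residue ranges and residue list are taken from the text; the ASCII element names and the `element_for` helper are hypothetical conveniences for this illustration.

```python
# Secondary structure elements of the subdomain (residue ranges from the text).
ELEMENTS = {
    "beta1": (3, 6),
    "alpha1": (8, 12),
    "alpha2": (16, 26),
    "beta2": (29, 32),
    "beta3": (35, 39),
    "alpha3": (42, 50),
}

# Conserved hydrophobic core residues listed in the text.
CORE_RESIDUES = [
    "Ile-6", "Leu-9", "Val-17", "Leu-20", "Val-21", "Phe-24",
    "Ala-27", "Ile-29", "Val-37", "Leu-45", "Ile-46", "Leu-49",
]


def element_for(residue_number):
    """Return the element containing the residue, or 'loop' if none does."""
    for name, (start, end) in ELEMENTS.items():
        if start <= residue_number <= end:
            return name
    return "loop"


core_map = {res: element_for(int(res.split("-")[1])) for res in CORE_RESIDUES}
for residue, element in core_map.items():
    print(f"{residue}: {element}")
```

By these ranges, every secondary structure element contributes at least one residue to the core, and only Ala-27 falls outside a defined element, in the short connection between α2 and β2.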


Fig. 6.   Alignment of the N-terminal 157 amino acids of IF2. The sequence of IF2 from E. coli, the subject of the present study, is compared with the homologous amino acids of IF2 from other proteobacteria within the β and γ subdivisions. The positions of the secondary structure elements are indicated. The conserved hydrophobic amino acids that make up the hydrophobic core of the subdomain are boxed. Additional conserved hydrophobic residues are indicated by stars. Small vertical arrows indicate exposed hydrophilic residues that are well conserved among the β and γ proteobacteria. Although only eight sequences are shown in the figure, a larger set of sequences was compared in deciding which residues are the most conserved.


Fig. 7.   Alignment of the N-terminal 50 amino acids of IF2 in a diverse set of bacteria. The alignment provides evidence that the structure of the 50-amino acid α/β motif at the N terminus of Domain I is well conserved among a wide range of bacterial species. The conserved residues that form the hydrophobic core of the 50-amino acid N-terminal motif are shown in red and indicated with red arrows. Sequences of IF2 were chosen so as to represent a much more divergent set of bacteria than that shown in Fig. 6. E. coli belongs to the γ subdivision of the proteobacteria; Desulfovibrio desulfuricans and Geobacter metallireducens belong to the δ subdivision of the proteobacteria; Campylobacter jejuni belongs to the ε subdivision of the proteobacteria; B. subtilis, B. stearothermophilus, Streptococcus pneumoniae, and Lactococcus lactis are members of the firmicutes; Nostoc punctiforme, Prochlorococcus marinus, Synechocystis, Thermosynechococcus elongatus, and Trichodesmium erythraeum belong to various subdivisions of the cyanobacteria; Deinococcus radiodurans and Thermus thermophilus belong to the thermus/deinococcus group; Thermotoga maritima represents the thermotogae; Chlorobium tepidum represents the chlorobi; and Chlamydia pneumoniae represents the chlamydiae group. Alignment of the conserved hydrophobic residues does not require any gaps except for the single amino acid deletion between residues 35 and 36 in the E. coli sequence; this deletion also occurs in some of the other β and γ proteobacteria (Fig. 6) and is in a loop region of the structure. Although an excellent alignment was found for the first 50 amino acids of IF2 from each species shown, significant similarity was not found for the amino acids in the region corresponding to 51-157 of the E. coli sequence.

In previous work, monoclonal antibodies were generated against IF2 from E. coli (17). One of these antibodies was epitope mapped on native full-length IF2 to the region of β2-β3 and the loop connecting these strands, suggesting that this region may be solvent-exposed in the full-length IF2; however, we note that epitope mapping is not an unambiguous indicator of solvent-exposed residues.

The coordinates for the structure of the IF2 N-terminal subdomain were compared against a database of known structures using the Vector Alignment Search Tool (VAST), located at the NCBI Web page, and the DALI search tools (29). The subdomain was found to be structurally similar to the B5 motif within Phe-tRNA synthetase (Fig. 8); the structures share the same βααββα topology, and the rmsd of the backbone atoms is 3.3 Å. The β1-strand and the three helices (α1, α2, and α3) in the domain of IF2 also superimpose well with a domain of methionyl-tRNA synthetase and glutaminyl-tRNA synthetase known as the stem contact (SC) fold (30). The backbone rmsds between the SC fold domains in the two E. coli aminoacyl-tRNA synthetases and the secondary structure elements β1, α1, α2, and α3 of the subdomain of IF2 are 2.4 and 2.6 Å, respectively (Fig. 8).


Fig. 8.   Relationship of the N-terminal subdomain (residues 2-50) of IF2 to structurally similar motifs found in tRNA synthetases. A, left, ribbon diagram of the subdomain (residues 2-50) of IF2 (PDB entry: 1ND9). Right, ribbon diagram of the B5 subdomain of phenylalanyl-tRNA synthetase (PheRS) from Thermus thermophilus (PDB entry: 1EIY). The two structures share the same βααββα topology, and residues that structurally align in the two structures are shown in the same color. The rmsd between the backbones of the two structures is 3.3 Å. B, left, ribbon diagram of the stem contact fold domain of methionyl-tRNA synthetase (MetRS) from E. coli (PDB entry: 1QQT). Middle, ribbon diagram of residues 2-50 of IF2. Right, ribbon diagram of the stem contact fold domain of glutaminyl-tRNA synthetase (GlnRS) from E. coli (PDB entry: 1QTQ). Residues that structurally align with each other are shown in the same color. The rmsds between the backbone of the domain of IF2 and the domains from the methionyl-tRNA synthetase and the glutaminyl-tRNA synthetase are 2.4 and 2.6 Å, respectively. The color coding is the same as in Fig. 5. The figure was made using the program Swiss PDB Viewer (40).

Structure of the Less Conserved Region (Residues 51-97)-- Residues 51-97 are not as widely conserved as residues 2-50, as indicated in Fig. 6. Although chemical shifts were assigned for most nuclei (all except those of residues 63-67, 88, and 97), the large majority of chemical shifts of the backbone as well as side-chain atoms appear at or very near to the random coil values, even for the very hydrophobic side chains. The only identified NOEs were intraresidue, sequential, and in a few cases medium range (Fig. 4). Without any observed long range NOEs, the three-dimensional structure of this region of the protein cannot be determined from the NMR data. Local NOE patterns, as well as Cα, CO, and Hα chemical shifts, indicate that residues Val-83-Val-85 are likely to be in an extended (β-strand) conformation; however, no long range NOEs connecting this strand to any region of the N-terminal subdomain were found. We conclude that residues 51-97 do not have a well defined structure under the conditions used for the NMR experiments. However, it is possible that these residues are structured in the context of the full-length IF2.

Structure of the C-terminal Region of Domain I (Residues 98-157)-- The 60-residue C-terminal region of IF2 Domain I contains a repetitive amino acid sequence where every fourth residue is an alanine, and the alanines are separated by sequences rich in glutamate, glutamine, lysine, and arginine. The pattern Arg-Glu-Ala is repeated six times. The repetitive sequence made chemical shift assignments particularly challenging for this region of the molecule. Unambiguous sequence specific assignments were obtained for residues 98-104, 120-126, and 149-157; these residues exhibit the sequential and medium range NOE patterns indicative of α-helical conformation, as well as Cα, CO, and Hα chemical shifts typical of helical structure (Fig. 4). Another helical Ala-Ala-Glu unit was resolved but could not be placed unambiguously in the sequence. Unresolved, overlapping peaks observed in the triple-resonance spectra at the chemical shifts expected for arginine, glutamate, and glutamine in α-helical conformation probably account for the resonances of the remaining residues in the C-terminal region of Domain I.

The 60-residue helix at the C terminus of Domain I has the potential to form a coiled-coil structure, resulting in dimerization of the protein. Analytical ultracentrifugation was therefore used to test for the presence of a homodimeric structure. The sedimentation coefficients of IF2 Domain I (residues 2-157), hen egg white lysozyme (129 residues), and bovine carbonic anhydrase (259 residues) were determined after the proteins had been dialyzed against identical solutions (1 mM phosphate buffer, pH 6, 20 °C). The sedimentation coefficients of the three proteins were found to be 1.46 S, 1.85 S, and 2.85 S, respectively. The observation that Domain I has a sedimentation coefficient that is low for its molecular weight (and less than that of lysozyme) can be best explained by the domain being a monomer with a significantly non-spherical shape. Consistent with the ultracentrifugation results, a half-filter nuclear Overhauser effect experiment designed to detect inter-subunit NOEs in a mixture of unlabeled and 13C/15N-labeled protein (31) provided no evidence for dimerization.
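The argument that 1.46 S is low for the molecular weight of Domain I can be made roughly quantitative. For compact, near-spherical proteins of similar partial specific volume, the sedimentation coefficient scales approximately as the 2/3 power of the molecular mass. The sketch below applies this scaling with lysozyme as the reference; the masses of lysozyme (about 14.3 kDa) and carbonic anhydrase (about 29 kDa) are assumed approximate values not stated in the text, and the calculation is a back-of-the-envelope check rather than part of the original analysis.

```python
def predicted_s(mass_kda, ref_mass_kda, ref_s):
    """Scale a reference sedimentation coefficient as M^(2/3).

    Assumes both proteins are compact spheres with the same partial
    specific volume, so that s is proportional to M / M^(1/3).
    """
    return ref_s * (mass_kda / ref_mass_kda) ** (2.0 / 3.0)


LYSOZYME_KDA = 14.3             # assumed approximate mass (not in the text)
CARBONIC_ANHYDRASE_KDA = 29.0   # assumed approximate mass (not in the text)
DOMAIN_I_KDA = 18.0             # "18-kDa Domain I" from the text
LYSOZYME_S = 1.85               # measured value from the text

s_ca = predicted_s(CARBONIC_ANHYDRASE_KDA, LYSOZYME_KDA, LYSOZYME_S)
s_domain = predicted_s(DOMAIN_I_KDA, LYSOZYME_KDA, LYSOZYME_S)

print(f"carbonic anhydrase: predicted {s_ca:.2f} S, measured 2.85 S")
print(f"Domain I monomer:   predicted {s_domain:.2f} S, measured 1.46 S")
```

Under these assumptions the scaling reproduces the carbonic anhydrase value to within a few percent, whereas a compact Domain I monomer would be expected to sediment noticeably faster than observed. A compact dimer would sediment faster still, so the low measured value is more naturally explained by an elongated monomer.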

The mostly hydrophilic nature of residues 98-157 suggests that this helical structure is solvent-exposed, perhaps forming a linker connecting Domain I with the other domains of IF2, similar to the helical linker that connects Domain VI-1 with VI-2 in IF2 (Fig. 1). Earlier epitope mapping studies of monoclonal antibodies on native IF2 from E. coli identified two non-overlapping epitopes in the region of residues 108-137 (17); this provides some independent evidence supporting the hypothesis that these residues are solvent-exposed in the full-length IF2 (however, it is again noted that epitope mapping is not an unambiguous indicator of solvent exposure). Although the precise amino acid sequence of residues 98-157 is not conserved, many of the bacteria contain similar repetitive sequences of mostly hydrophilic amino acids with a high propensity for forming an α-helix (Fig. 6), suggesting that the helical linker may be a feature that is present in Domain I of IF2 of many of the bacteria.

15N Relaxation Rates and Internal Motions within Domain I of IF2-- 15N relaxation rate data (T1, T2, and 15N-1H NOE) were obtained for 48 backbone amide nitrogens of residues in the range 6-157 that have resonances that are well resolved in two-dimensional 15N-1H correlated spectra. In terms of 15N relaxation behavior, the domain can be divided into three regions. For residues 6-50, the observed values for T1, T2, and the 15N-1H NOE are strikingly uniform, averaging 0.54 s, 0.14 s, and 0.62, respectively. These values are consistent with a well ordered structure with a rotational correlation time of 6.7 ns. This rotational correlation time is typical of a protein with a molecular weight significantly less than that of the full 156-residue Domain I, and therefore suggests that the first 50 residues form a relatively rigid subdomain that moves independently of the other regions of the protein. Residues 55-95 exhibit negative values for the 15N-1H NOE and relatively long T2 relaxation times, indicating a flexible and disordered structure. For the long helix at the C terminus of Domain I, only four residues (123, 149, 152, and 157) have amide resonances that are sufficiently well resolved for relaxation rate data to be obtained. These residues differed widely in their relaxation rates, suggesting that the motions of the long helix cannot be described using a simple model.
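
The link between the averaged T1 and T2 values and a correlation time of several nanoseconds can be illustrated with a commonly used approximation for rigid, slowly tumbling proteins, in which the correlation time is estimated from the T1/T2 ratio and the 15N resonance frequency. The sketch below is not the analysis used to obtain the 6.7 ns value reported above. The spectrometer field is not stated in this section, so the 500 MHz proton frequency is an assumption, and the function name is illustrative only.

```python
import math

# Magnitude of the 15N/1H gyromagnetic ratio; converts a proton
# frequency into the corresponding 15N resonance frequency.
GAMMA_RATIO_N_TO_H = 0.101329


def tau_c_from_t1_t2(t1_s, t2_s, proton_mhz):
    """Estimate the rotational correlation time (seconds) from 15N T1 and T2
    using the approximation tau_c = sqrt(6*T1/T2 - 7) / (4*pi*nu_N).
    The approximation assumes a rigid molecule in the slow-tumbling limit."""
    nu_n_hz = proton_mhz * 1.0e6 * GAMMA_RATIO_N_TO_H
    ratio = t1_s / t2_s
    return math.sqrt(6.0 * ratio - 7.0) / (4.0 * math.pi * nu_n_hz)


# Average values reported for residues 6-50
T1_AVG_S = 0.54
T2_AVG_S = 0.14

# ASSUMPTION: 500 MHz proton frequency (not stated in this section)
ASSUMED_PROTON_MHZ = 500.0

tau_c_ns = 1.0e9 * tau_c_from_t1_t2(T1_AVG_S, T2_AVG_S, ASSUMED_PROTON_MHZ)

if __name__ == "__main__":
    print(f"estimated tau_c at {ASSUMED_PROTON_MHZ:.0f} MHz: {tau_c_ns:.1f} ns")
```

With the assumed field, the estimate is a little over 6 ns, the same range as the reported value. The result depends directly on the assumed spectrometer frequency, so the sketch should be read as an order-of-magnitude check only.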

    DISCUSSION

How well conserved is the compact α/β subdomain that we have identified at the N terminus of IF2? To systematically address the issue of structural conservation, the amino acid sequences at the N termini of IF2 were examined for a set of 68 diverse bacteria whose genomes have been sequenced. IF2 in 49 of the 68 bacteria contains an N-terminal sequence that is clearly homologous to the α/β motif at the N terminus of E. coli IF2, as indicated by the strong conservation of the residues that form the hydrophobic core of the subdomain (Fig. 7). IF2 in 13 of the 68 organisms may contain a homologous N-terminal structure; in these cases it was difficult to be certain, since aligning the sequences required the insertion of one or more gaps. Only 6 of the 68 species clearly do not appear to contain a sequence homologous to the 50-amino acid N-terminal motif. An example of these organisms is Mycoplasma genitalium, the bacterium with the smallest known genome; in this species IF2 is unusually small, containing only 620 amino acids rather than the ~900 amino acids found in IF2 of most bacteria. In summary, our investigation of the amino acid sequences of IF2 in various species indicates that the large majority of the bacteria contain a 50-amino acid α/β motif at the N terminus of their IF2 that is structurally homologous to that found in E. coli.
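
The counts given above can be converted to fractions of the survey, which makes the phrase "large majority" more concrete. The short sketch below does only that arithmetic; the category labels are illustrative names, not terms used in the alignment itself.

```python
# Survey counts quoted in the text; the labels are illustrative names.
counts = {
    "clearly homologous": 49,
    "possibly homologous": 13,
    "not homologous": 6,
}

total = sum(counts.values())

# Fraction of the surveyed species in each category
fractions = {label: n / total for label, n in counts.items()}

# Upper bound on conservation if every ambiguous case is counted as homologous
upper_bound = (counts["clearly homologous"] + counts["possibly homologous"]) / total

if __name__ == "__main__":
    for label, frac in fractions.items():
        print(f"{label}: {counts[label]}/{total} = {100 * frac:.0f}%")
    print(f"clearly or possibly homologous: {100 * upper_bound:.0f}%")
```

The three categories account for all 68 species. Roughly seven in ten are clearly homologous, and about nine in ten are homologous if the ambiguous alignments are included.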

Although the conserved hydrophobic residues indicate that the overall shape of the 50-amino acid subdomain is well conserved across a wide range of bacteria, it is interesting that there are no surface residues that are as well conserved as the hydrophobic core. For example, the surface residues at positions 18, 41, and 42 in the β and γ proteobacteria are conserved in terms of their charge (indicated by black arrows in Fig. 6); however, residues at these same positions are not conserved in a broader range of bacterial species (Fig. 7). Conversely, there are other surface residues that are conserved in a wide range of bacteria (such as lysine 3 in Bacillus subtilis, Fig. 7) that differ in the β and γ proteobacteria (Fig. 6). The strong conservation of the subdomain structure suggests its general importance, while the variability of its surface residues could be explained by its precise function being more species-specific.

The present data indicate a structural relationship between the N-terminal subdomain and the SC fold domains in the class Ia aminoacyl-tRNA synthetases. The SC fold domain docks to the inner side of the L-shaped tRNA, thereby positioning the anticodon stem of the tRNA (30). The domain connects the acceptor and anticodon binding domains of the tRNA synthetases and may provide a functional communication of anticodon recognition from the anticodon binding domain to the acceptor region binding domain and active site of the synthetases (32). Interestingly, even though a high degree of structural homology is found for the fold in the Gln- and Met-tRNA synthetases, no significant sequence similarity exists, although generally the C-terminal helix in the motif has a net negative charge despite its location adjacent to the highly negatively charged phosphate backbone of the tRNA (33). A similar pattern is found in the case of IF2, where there is no obvious sequence homology for the surface residues of the subdomain, but a net negative charge is found for the region corresponding to helix α3 of the subdomain in the great majority of bacterial sequences investigated.

IF2 interacts with the initiator fMet-tRNAfMet during the initiation of translation. This interaction has been shown to depend mainly on the extreme C-terminal domain of IF2 (34, 35). However, a part of Domain II of E. coli IF2 (Fig. 1) has been cross-linked to the anticodon arm of fMet-tRNAfMet (36). It may be speculated that IF2 and methionyl-tRNA synthetase interact with the anticodon stem region of the initiator tRNA in a similar manner, since they each have a similarly structured subdomain and both interact with the initiator tRNA. Moreover, footprinting studies show that IF2 and methionyl-tRNA synthetase protect the same position in the anticodon stem of the initiator tRNA against RNase cleavage (37, 38). The SC fold domain in methionyl-tRNA synthetase is followed by a helix bundle that interacts with the anticodon stem and loop of the tRNA (39). The present study indicates a helical region in Domain I (residues 98-157). The N-terminal region of Domain II of bacterial IF2 also contains a region of residues with high helix-forming propensity, and circular dichroism spectra of the isolated Domain II of E. coli IF2 indicate a mainly helical structure (not shown). This raises the possibility that the C-terminal region of Domain I and the N-terminal region of Domain II fold into a helix bundle similar to the helix bundle in the methionyl-tRNA synthetase, and interact with the anticodon stem region. This speculation is supported by the previous cross-linking (36) and footprinting (37, 38) data. It should be noted, however, that the binding mechanisms must differ, since methionyl-tRNA synthetase covers the anticodon of the tRNA, whereas the anticodon must be free to interact with the initiation codon of the mRNA during translation initiation.

Further studies are clearly required to fully characterize the structural and functional properties of the N terminus of IF2, and to address the puzzling question of why more than one isoform of IF2, differing only in the presence of Domain I, exists in the enterobacteria. The structural view of Domain I provided by the present work will aid in the design of experiments directed toward determining more specifically the details of the interactions between Domains I-II of IF2, the 30 S ribosomal subunit, and the infB mRNA.

    ACKNOWLEDGEMENTS

We thank John Kenney and Cedric Dicko at the Institute for Storage Ring Facilities, Aarhus University, Denmark for help with Circular Dichroism.

    FOOTNOTES

* This work was funded by Grants 9901722 and 51-00-0263 from the Familien Hede Nielsens Fund and the Danish Natural Science Research Council (to H. U. S.-P.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Chemical shift assignments for the domain of IF2 have been submitted to the BioMagResBank and assigned the accession number BMRB-5624.

The atomic coordinates and the structure factors (code 1ND9) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).

§ Supported by a personal grant from Knud Højgaards Fund, Denmark.

To whom correspondence may be addressed: Dept. of Molecular Biology, University of Aarhus, Denmark. Tel.: 45-89425050; Fax: 45-86182812; E-mail: husp@biobase.dk.

** Supported by Grant F-1353 from the Welch Foundation. To whom correspondence may be addressed: Dept. of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, University of Texas, Austin, TX 78712. Tel.: 512-471-7859; Fax: 512-471-8696; E-mail: dhoffman@mail.utexas.edu.

Published, JBC Papers in Press, February 24, 2003, DOI 10.1074/jbc.M212960200

    ABBREVIATIONS

The abbreviations used are: IF, initiation factor; rmsd, root mean square deviation; SC, stem contact; MALDI-TOF, matrix-assisted laser desorption/ionization time-of-flight; CSI, chemical shift index; PDB, protein data bank.

    REFERENCES

1. Ramakrishnan, V. (2002) Cell 108, 557-572
2. Moreno, J. M., Sørensen, H. P., Mortensen, K. K., and Sperling-Petersen, H. U. (2000) IUBMB Life 50, 347-354
3. Gualerzi, C. O., Brandi, L., Caserta, E., Teana, A. L., Spurio, R., Tomsic, J., and Pon, C. L. (2000) in The Ribosome: Structure, Function, Antibiotics, and Cellular Interactions (Garret, R. A., Douthwaite, S. R., Liljas, A., Matheson, A. T., Moore, P. B., and Noller, H. F., eds), pp. 477-494, ASM Press, Washington, D. C.
4. Boelens, R., and Gualerzi, C. O. (2002) Curr. Protein Pept. Sci. 3, 107-119
5. Steffensen, S. A., Poulsen, A. B., Mortensen, K. K., and Sperling-Petersen, H. U. (1997) FEBS Lett. 419, 281-284
6. Sørensen, H. P., Hedegaard, J., Sperling-Petersen, H. U., and Mortensen, K. K. (2001) IUBMB Life 51, 321-327
7. Kyrpides, N. C., and Woese, C. R. (1998) Proc. Natl. Acad. Sci. U. S. A. 95, 224-228
8. Choi, S. K., Lee, J. H., Zoll, W. L., Merrick, W. C., and Dever, T. E. (1998) Science 280, 1757-1760
9. Pestova, T. V., Lomakin, I. B., Lee, J. H., Choi, S. K., Dever, T. E., and Hellen, C. U. (2000) Nature 403, 332-335
10. Roll-Mecak, A., Cao, C., Dever, T. E., and Burley, S. K. (2000) Cell 103, 781-792
11. Meunier, S., Spurio, R., Czisch, M., Wechselberger, R., Guenneugues, M., Gualerzi, C. O., and Boelens, R. (2000) EMBO J. 19, 1918-1926
12. Nyengaard, N. R., Mortensen, K. K., Lassen, S. F., Hershey, J. W., and Sperling-Petersen, H. U. (1991) Biochem. Biophys. Res. Commun. 181, 1572-1579
13. Mortensen, K. K., Hajnsdorf, E., Regnier, P., and Sperling-Petersen, H. U. (1995) Biochem. Biophys. Res. Commun. 214, 1254-1259
14. Howe, J. G., and Hershey, J. W. (1983) J. Biol. Chem. 258, 1954-1959
15. Sacerdot, C., Vachon, G., Laalami, S., Morel-Deville, F., Cenatiempo, Y., and Grunberg-Manago, M. (1992) J. Mol. Biol. 225, 67-80
16. Laursen, B. S., Steffensen, S. A., Hedegaard, J., Moreno, J. M., Mortensen, K. K., and Sperling-Petersen, H. U. (2002) Genes Cells 7, 901-910
17. Mortensen, K. K., Kildsgaard, J., Moreno, J. M., Steffensen, S. A., Egebjerg, J., and Sperling-Petersen, H. U. (1998) Biochem. Mol. Biol. Int. 46, 1027-1041
18. Moreno, J. M., Kildsgaard, J., Siwanowicz, I., Mortensen, K. K., and Sperling-Petersen, H. U. (1998) Biochem. Biophys. Res. Commun. 252, 465-471
19. Moreno, J. M., Drskjotersen, L., Kristensen, J. E., Mortensen, K. K., and Sperling-Petersen, H. U. (1999) FEBS Lett. 455, 130-134
20. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Nucleic Acids Res. 25, 3389-3402
21. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) J. Biomol. NMR 6, 277-293
22. Wishart, D. S., Bigam, C. G., Yao, J., Abildgaard, F., Dyson, H. J., Oldfield, E., Markley, J. L., and Sykes, B. D. (1995) J. Biomol. NMR 6, 135-140
23. Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. (1998) Acta Crystallogr. Sect. D Biol. Crystallogr. 54, 905-921
24. Wishart, D. S., and Sykes, B. D. (1994) Methods Enzymol. 239, 363-392
25. Laskowski, R. A., Rullmann, J. A., MacArthur, M. W., Kaptein, R., and Thornton, J. M. (1996) J. Biomol. NMR 8, 477-486
26. Farrow, N. A., Muhandiram, R., Singer, A. U., Pascal, S. M., Kay, C. M., Gish, G., Shoelson, S. E., Pawson, T., Forman-Kay, J. D., and Kay, L. E. (1994) Biochemistry 33, 5984-6003
27. Lillemoen, J., and Hoffman, D. W. (1998) J. Mol. Biol. 281, 539-551
28. Mandel, A. M., Akke, M., and Palmer, A. G., III (1995) J. Mol. Biol. 246, 144-163
29. Holm, L., and Sander, C. (1993) J. Mol. Biol. 233, 123-138
30. Sugiura, I., Nureki, O., Ugaji-Yoshikawa, Y., Kuwabara, S., Shimada, A., Tateno, M., Lorber, B., Giege, R., Moras, D., Yokoyama, S., and Konno, M. (2000) Structure Fold. Des. 8, 197-208
31. Folkers, P. J. M., Folmer, R. H. A., Konings, R. N. H., and Hilbers, C. W. (1993) J. Am. Chem. Soc. 115, 3798-3799
32. Sherman, J. M., Thomann, H. U., and Soll, D. (1996) J. Mol. Biol. 256, 818-828
33. Perona, J. J., Rould, M. A., Steitz, T. A., Risler, J. L., Zelwer, C., and Brunie, S. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 2903-2907
34. Spurio, R., Brandi, L., Caserta, E., Pon, C. L., Gualerzi, C. O., Misselwitz, R., Krafft, C., Welfle, K., and Welfle, H. (2000) J. Biol. Chem. 275, 2447-2454
35. Guenneugues, M., Caserta, E., Brandi, L., Spurio, R., Meunier, S., Pon, C. L., Boelens, R., and Gualerzi, C. O. (2000) EMBO J. 19, 5233-5240
36. Yusupova, G., Reinbolt, J., Wakao, H., Laalami, S., Grunberg-Manago, M., Romby, P., Ehresmann, B., and Ehresmann, C. (1996) Biochemistry 35, 2978-2984
37. Petersen, H. U., Kruse, T. A., Worm-Leonhard, H., Siboska, G. E., Clark, B. F., Boutorin, A. S., Remy, P., Ebel, J. P., Dondon, J., and Grunberg-Manago, M. (1981) FEBS Lett. 128, 161-165
38. Petersen, H. U., Siboska, G. E., Clark, B. F., Buckingham, R. H., Hountondji, C., and Blanquet, S. (1984) Biochimie (Paris) 66, 625-630
39. Mechulam, Y., Schmitt, E., Maveyraud, L., Zelwer, C., Nureki, O., Yokoyama, S., Konno, M., and Blanquet, S. (1999) J. Mol. Biol. 29, 1287-1297
40. Guex, N., and Peitsch, M. C. (1997) Electrophoresis 18, 2714-2723


Copyright © 2003 by The American Society for Biochemistry and Molecular Biology, Inc.