The pyruvate formate lyase family: sequences, structures and activation

L. Lehtiö1,2 and A. Goldman2,3

1Graduate School in Informational and Structural Biology and 2Program in Structural Biology and Biophysics, Institute of Biotechnology, University of Helsinki, PO Box 65, FIN-00014 Helsinki, Finland

3 To whom correspondence should be addressed. E-mail: adrian.goldman{at}helsinki.fi


    Abstract
 
We cloned and expressed in Escherichia coli the Archaeoglobus fulgidus gene that encodes pyruvate formate lyase 2 (PFL2). Despite its homology to the other glycyl radical enzymes, PFL2 differs from them in its oligomeric state. The most abundant form of PFL2 when expressed in E.coli is a trimer, whereas its closest homologue of known structure, E.coli PFL, is a dimer. Sequence comparisons allowed us to reclassify PFL-like enzymes and the consensus sequences allowed us to propose an activation route for PFL-like glycyl radical enzymes. Surprisingly, most of the conserved residues in PFL-like enzymes appear to be involved in preserving the structure, rather than forming the active site.

Keywords: activation/pyruvate formate lyase/radical/trimer


    Introduction
 
Genome sequencing projects have revealed that bacterial and archaeal genomes contain glycyl radical enzymes in addition to the already characterized type III ribonucleotide reductases (RNRIII) (Mulliez et al., 1993; Young et al., 1996), pyruvate formate lyase (PFL) (Knappe et al., 1984; Wagner et al., 1992), keto acid formate lyase (Hesslinger et al., 1998), glycerol dehydratase (Raynaud et al., 2003), benzyl succinate synthetase (Leuthner et al., 1998) and p-hydroxyphenylacetate decarboxylase (Selmer and Andrei, 2001). For instance, Archaeoglobus fulgidus appears to contain a new type of glycyl radical enzyme, pyruvate formate lyase 2 (PFL2) (Klenk et al., 1997). The closest homologue to PFL2 of the known glycyl radical enzymes is glycerol dehydratase, but the most homologous enzyme with a known structure is PFL (Becker et al., 1999; Leppänen et al., 1999).

PFLs (like RNRs) (e.g. PDB i.d.s 3PFL, 4R1R) contain a 10-stranded α/β-barrel consisting of two sets of five parallel α/β-units assembled in an antiparallel manner. Inside the barrel is a loop containing the active-site cysteines. The barrel also has a C-terminal loop containing Gly734, which is converted to a glycyl radical when the enzyme is activated. All known glycyl radical enzymes are dimers (Conradt et al., 1984; Ollagnier et al., 1996; Leuthner et al., 1998; Selmer and Andrei, 2001; O'Brien et al., 2004) and at least PFL and RNRIII display half-the-sites reactivity (Unkrig et al., 1989; Young et al., 1996), so that only one of the glycines of the dimer is in the radical state at a time. This radical is regenerated in each reaction cycle. All the structures solved so far are of the inactive non-radical enzyme and therefore do not reflect the picture of an enzyme in action, because of the conformational changes during activation (discussed by Lehtiö et al., 2002). The change upon activation is unprecedented in enzyme mechanism, because the glycine goes from sp3 to sp2 hybridization. This conformational change during activation renders the other monomer incapable of being activated, in ways that are not understood. The radical state has so far not been stable enough for structural analysis.

Glycyl radical enzymes are activated by specific activating enzymes, which belong to the newly established radical SAM superfamily (Sofia et al., 2001). These proteins generate radical species by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe–S center. Radical SAM proteins catalyse a variety of reactions, ranging from unusual methylations to protein radical formation. Recent crystal structures have revealed the structural frameworks for radical SAM proteins (Layer et al., 2003; Berkovitch et al., 2004). In A.fulgidus there are several activating enzyme homologues, but one of them has a coding sequence that overlaps with the gene encoding PFL2 (gene AF1449). In order to expand our knowledge of glycyl radical enzymes, we decided to clone and express PFL2 from this anaerobic thermophilic sulfate-reducing archaeon. We reasoned that, being a thermophilic protein, its glycyl radical might be more stable at ambient temperature than in the Escherichia coli counterpart. We found that PFL2, unlike all other glycyl radical enzymes, is a homotrimer and this prompted us to reanalyse and reclassify PFL-like sequences.


    Materials and methods
 
Cloning

The gene for PFL2 was PCR-cloned from the A.fulgidus genome (genomic DNA was a gift from Professor Nils-Kåre Birkeland) into the SmaI restriction site of the pQE-30 vector (Qiagen), digested with BamHI and religated to bring the gene in frame with the histidine tag (primer sequences are available as Supplementary data at PEDS online). The untagged construct was made from the tagged vector by inverse polymerase chain reaction (PCR). Ligation products were transformed into TOP10 cells (Qiagen) and the transformants were screened by direct colony PCR with the sequencing primers and by restriction digestion. The sequences of the constructs were then verified.

Protein expression and purification

The plasmids were transformed into the Rosetta(DE3)pLacI strain (Novagen). All cell cultures were grown in LB medium supplemented with 100 µg/ml ampicillin and 17 µg/ml chloramphenicol. Cultures of 400 ml were inoculated with 2 ml of an overnight preculture and grown at 37°C in a 1 l Erlenmeyer flask shaken at 225 r.p.m. The culture was grown to an OD600 of 0.7–0.8 and induced with 0.2 mM IPTG. Cells were then grown to late log-phase, which took about 3 h, and collected by centrifugation at 7000 r.p.m. for 15 min. The cell pellet was subsequently resuspended in 10 ml of buffer (20 mM Tris–HCl pH 7.5, 100 mM NaCl), pelleted again at 4000 r.p.m. for 15 min and stored at –20°C for further use.

A 10 ml volume of lysis buffer containing 50 mM MES pH 7.0, 5 mM DTT, 5 mM EDTA, 0.1 mM PMSF and 5% glycerol was added to the frozen cells (~1.5 g). Cells were then thawed, placed in an ice–water bath and lysed by sonication (Labsonic U, B. Braun) using a T 12 probe twice for 15 min with a sonication:rest cycle of 1:1. Crude lysate was centrifuged at 15 000 r.p.m. for 15 min. The cleared lysate (~10.5 ml) was then diluted to 30 ml with buffer (20 mM Bis-Tris pH 6.4, 1 mM DTT) and loaded on an anion-exchange column (HiLoad Q Sepharose FF 16/10, Pharmacia). Proteins were eluted from the column operated at 2 ml/min with an increasing NaCl gradient from 0 to 500 mM. PFL2 eluted as a broad peak (50 ml) with a maximum at ~360 mM NaCl. The fractions containing PFL2, as judged by SDS–PAGE, were combined and concentrated to ~1 ml using a Centriprep-30 concentrator (Amicon). The concentrated protein solution was then gel filtered with Superdex-200 medium (26/60, Pharmacia) equilibrated with a buffer containing 20 mM Tris–HCl pH 7.5, 100 mM NaCl and 1 mM DTT. Most of the PFL2 eluted in one peak (9 ml). That peak was concentrated to ~600 µl and at the same time the buffer was changed to 10 mM Tris–HCl pH 7.5 and 1 mM DTT by multiple concentration and dilution steps. The His-tagged protein was also purified, but it tended to precipitate and resulted in three peaks of equal size in gel filtration following the Ni-NTA column, so we decided to continue with the untagged version.

The purity of the protein was analysed by SDS–PAGE and the concentrations were measured with a Bio-Rad kit (Bradford, 1976). Concentrated protein samples were either stored at 4°C or frozen in liquid nitrogen and stored at –80°C. The molecular weight of PFL2 was confirmed by electrospray mass spectrometry.

Dynamic light scattering

The concentrated protein samples were diluted to 10 mg/ml in buffer (10 mM Tris–HCl pH 7.5, 100 mM NaCl and 1 mM DTT) and a 20 µl sample was injected on to an S-200 10/30 column (Pharmacia) on an HPLC system (Waters). During the run, absorbance at 280 nm, refractive index and light scattering were monitored with a PD2000DLS detector (Precision Detectors). Refractive index and 90° scattering were used to calculate the molecular weight of the eluted peaks based on calibration with bovine serum albumin.
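The single-standard calibration described above can be illustrated with a minimal sketch. It assumes the usual proportionality for light scattering coupled to a concentration detector, namely that the scattering signal scales with concentration times molar mass while the refractive index signal scales with concentration, and that the standard and the sample share the same refractive index increment. The function names, the BSA monomer mass of about 66.4 kDa and the detector readings are illustrative assumptions; this is not the procedure implemented in the instrument software.

```python
# Sketch only: molar mass from the ratio of light-scattering (LS) to
# refractive-index (RI) signal, calibrated against a single standard.
# Assumption: standard and sample have the same refractive index increment,
# so that mass = k * LS / RI with one instrument constant k.

BSA_MASS_DA = 66_430.0  # approximate mass of monomeric bovine serum albumin


def calibration_constant(ls_std, ri_std, mass_std=BSA_MASS_DA):
    """Return k such that mass_std == k * ls_std / ri_std."""
    return mass_std * ri_std / ls_std


def molar_mass(ls, ri, k):
    """Estimate the molar mass of a peak from its LS and RI signals."""
    return k * ls / ri


# Hypothetical peak signals in arbitrary detector units (not measured data)
k = calibration_constant(ls_std=1.0, ri_std=2.0)
sample_mass = molar_mass(ls=3.0, ri=1.5, k=k)
print(f"estimated mass: {sample_mass:.0f} Da")
```

Because the estimate rests on a ratio of two signals from the same peak, it does not depend on the elution volume, which is why it can distinguish oligomeric states even when the protein runs anomalously on the column.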

Database searches

The amino acid sequences of E.coli PFL and A.fulgidus PFL2 were subjected to a psi-blast search at the NCBI server (Altschul et al., 1997). The sequences found after five iterative runs were then aligned with ClustalW (Thompson et al., 1994). The C-terminal part of some of the sequences (group SHORT in Figure 3) had to be adjusted manually, in order to align the glycine loops. Distance matrices were calculated with Protdist and the phylogenetic tree was constructed with Quicktree (Phylip) (Felsenstein, 1989). Sequence homology was illustrated with the help of Treeview (Page, 1996) and Genedoc (Nicholas and Nicholas, 1997). A total of 113 sequences were used in the final analysis.



Fig. 3. Phylogenetic tree plotted with Treeview (Page, 1996). The sequence groups used for the analysis are coloured differently. CCPFL contains enzymes with two adjacent cysteines in the active site. BSS contains benzyl succinate synthetase homologues. GD contains glycerol dehydratase as well as A.fulgidus PFL2. The sequences in CPFL and SHORT are all of unknown function. The individual sequences are labelled with the NCBI gene identifier number. The proteins discussed in the article are marked with a label next to the sequence i.d.: PFL is pyruvate formate lyase, TdcE is keto acid formate lyase, GDH is glycerol dehydratase, PFL2 is A.fulgidus PFL2, TutD is benzyl succinate synthetase and HpdB is p-hydroxyphenylacetate decarboxylase. The horizontal scale bar at the bottom of the image defines the evolutionary distance corresponding to 0.1 mutations per residue.

 

    Results
 
Properties of PFL2

Expression levels of PFL2 were very good in the Rosetta system and the recombinant protein concentration in cells was estimated to be close to 10% (8 mg yield from 400 ml of culture; Figure 1). We used both SDS–PAGE and mass spectrometry to check the protein purity and to confirm that the protein was not partly cleaved by activation, as occurs in E.coli PFL (the cleavage is due to residual activity of the E.coli activating enzyme). According to SDS–PAGE the protein is about 5 kDa smaller than expected, but the electrospray mass spectrometry result (87 080 Da) is consistent with the calculated mass without the first methionine (87 035 Da). On native PAGE run at pH 8.8 the protein localized in three distinct bands (Figure 1), but when it was run at pH 7.5, most of the protein was in a single band. The oligomerization state of the protein was unclear, so we measured dynamic light scattering coupled to a gel filtration column (Figure 2). It is evident that PFL2 is almost completely in the trimeric form under the conditions used (20 mM Tris–HCl pH 7.5, 100 mM NaCl, 1 mM DTT) with only a small amount of monomer. According to DLS the monomer is ~95 kDa in contrast to the actual size of 87 kDa, which is due to the relatively larger hydrodynamic radius of PFL2 compared with BSA (used for calibration). The trimer:monomer ratio is approximately 19:1 as estimated from the area under the peaks. As the sample was not centrifuged before analysis, there is a small amount of aggregate, which gives a large peak in the 90° scattering.
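The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the reading of the 19:1 ratio as a fraction of total peak area is an assumption about how the ratio was derived.

```python
# Worked arithmetic on values quoted in the text (variable names are illustrative).

measured_mass_da = 87_080      # electrospray mass spectrometry result
calculated_mass_da = 87_035    # calculated mass without the first methionine
mass_difference_da = measured_mass_da - calculated_mass_da
relative_difference = mass_difference_da / calculated_mass_da

# Trimer:monomer ratio of about 19:1, read here as a ratio of peak areas
trimer_area, monomer_area = 19, 1
trimer_fraction = trimer_area / (trimer_area + monomer_area)

# Monomer size reported by the light-scattering analysis versus the actual size
dls_monomer_kda, actual_monomer_kda = 95, 87
dls_overestimate = dls_monomer_kda / actual_monomer_kda - 1

# Purified yield scaled to one litre of culture
yield_mg, culture_ml = 8, 400
yield_mg_per_litre = yield_mg * 1000 / culture_ml

print(f"mass difference: {mass_difference_da} Da ({relative_difference:.3%})")
print(f"trimer fraction of peak area: {trimer_fraction:.0%}")
print(f"light-scattering overestimate of monomer: {dls_overestimate:.1%}")
print(f"yield: {yield_mg_per_litre:.0f} mg per litre of culture")
```

The mass difference of 45 Da is about 0.05% of the calculated mass, which is the sense in which the two values are consistent, and a 19:1 area ratio corresponds to 95% of the protein in the trimer peak.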



Fig. 1. (A) Reducing SDS–PAGE of the samples taken during expression and purification of PFL2. The samples are from uninduced culture, induced culture, cleared lysate, preparation after anion exchange and after gel filtration, followed by the molecular weight standard. The gel was overloaded in order to show the impurities left at each step. (B) Native PAGE run at pH 8.8, showing the three distinct bands.

 


Fig. 2. Dynamic light scattering analysis of size-exclusion chromatography. (A) Normalized curves of absorbance (red), refractive index (blue) and light scattering at 90° (green) monitored during gel filtration. Peak (a) is due to a small amount of large aggregates, peak (b) is a monomodal peak that corresponds to the trimeric 300 kDa form of the protein and peak (c) is a monomodal monomer peak. (B) Derived molecular weight distribution plotted against the relative abundance in the sample calculated by the PrecisionAnalyze program (version 4.10.007, Precision Detectors). Peaks (d) and (e) correspond to monomer and trimer forms, respectively.

 
Sequence searches

The sequences were searched against the A.fulgidus PFL2 sequence and used in the comparison if they contained a sequence with the active-site glycine as defined by the alignment with the A.fulgidus PFL2 sequence. The sequence similarity searches revealed many short sequences (81–200 amino acids) homologous only to the C-terminal part of PFLs. The proteins came from various prokaryotes and bacteriophages and are very homologous to the acid-induced protein yfiD in E.coli, which can repair PFL that has been inactivated by dioxygen by acting as a replacement part for the C-terminus (Wagner et al., 2001). These sequences and a partial sequence of Geobacter metallireducens benzylsuccinate synthetase (Kane et al., 2002) were excluded from further analysis.

Two PFL-like proteins that were significantly longer were also excluded. The hypothetical protein from Ralstonia solanacearum (1351 aa; NCBI gi:17546265) (Salanoubat et al., 2002) is a fusion of a PFL-like protein and a Dyp-type peroxidase and since it is the only one of this kind we excluded it as it may be due to an error in genome sequencing. A hypothetical protein found in the pathogenicity island of uropathogenic E.coli (1140 aa; NCBI gi: 23954275; Dobrindt et al., 2002) is a partial N-terminal duplication of a PFL-like protein and was similarly excluded. The following discussion therefore focuses on PFL-like proteins that are of roughly the same size as PFL2 and contain both the glycine loop sequence and at least one cysteine in the hypothetical active site. The sequence identities and similarities between the sequences and A.fulgidus PFL2 range from 6 to 34% and from 16 to 54%, respectively. It is remarkable that none of the ribonucleotide reductases were found to be homologous to PFL2. Even though the solved structures are similar, there is no overall sequence homology between these two protein families (Leppänen et al., 1999).
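Identity and similarity percentages of the kind quoted above depend on how the alignment columns are counted. The following sketch shows one common convention, counting only columns in which neither sequence has a gap, and uses the similarity groups listed in the legend to Figure 4 (NQ, ST, KR, FYW, LIVM, DE). The function name, the choice of denominator and the example sequences are assumptions made for illustration; the text does not state which convention the alignment software applied.

```python
# Sketch only: percent identity and similarity for two pre-aligned sequences.
# Assumption: columns where either sequence has a gap are ignored.

SIMILARITY_GROUPS = ["NQ", "ST", "KR", "FYW", "LIVM", "DE"]


def _similarity_class(residue):
    """Return the similarity group containing the residue, or the residue itself."""
    for group in SIMILARITY_GROUPS:
        if residue in group:
            return group
    return residue


def identity_and_similarity(seq_a, seq_b, gap="-"):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0, 0.0
    identical = sum(a == b for a, b in pairs)
    similar = sum(_similarity_class(a) == _similarity_class(b) for a, b in pairs)
    return 100 * identical / len(pairs), 100 * similar / len(pairs)


# Toy aligned fragments (not taken from the PFL alignment)
identity, similarity = identity_and_similarity("ACDE-K", "ACEEGR")
print(identity, similarity)
```

Under this convention identical residues also count as similar, so the similarity value can never be lower than the identity value, as is the case for the ranges reported in the text.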

To analyse the conservation and differences between groups, we assigned the sequences into six groups according to the phylogenetic tree (Figure 3). We decided to leave p-hydroxyphenylacetate decarboxylase, in addition to sequences from Bacteroides thetaiotaomicron and from an anaerobic lithoautotrophic thermophilic archaeon, Methanothermobacter thermoautotrophicus, out of the grouping, because they clearly did not belong to any other groups, but rather formed groups of only one sequence. In the group where all the sequences have two adjacent cysteine residues in the active site (CCPFL), the sequences from Streptococcus and Lactococcus form a branch of their own, but they are still clearly part of that group. A.fulgidus PFL2, the first archaeal PFL-like enzyme purified, is most similar to the glycerol dehydratase group (GD), although it is clearly an outlier in that group.

Sequence alignment (see Supplementary data) and phylogenetic analysis (Figure 3) showed that the CCPFLs formed a distinct family, containing all the enzymes that have been shown to catalyse the cleavage of pyruvate. This is not surprising since it has been shown that both of the active-site cysteines are necessary for the catalytic cycle of PFL (Knappe et al., 1993). On the other hand, it suggests that all the other PFL-like enzymes found may have a different enzymatic activity. Thirteen CPFL sequences from Clostridium, Escherichia, Haemophilus, Salmonella and Vibrio form a separate group among CPFLs (labelled SHORT) and probably share a common enzymatic activity, especially as they are all significantly shorter than the others, being only ~500 residues in length. This group forms a new, previously unidentified, type of glycyl radical enzyme.

Consensus sequences

We extracted consensus sequences from each group that we defined by the phylogenetic analysis and used those for comparisons between groups (Figure 3). There were two reasons for doing this. First, the groups contain different numbers of sequences and presumably have diverged for different lengths of time. As a result, sequence conservation provides a less stringent measure of the key residues in some groups than in others. Second, the overall consensus sequence for all the 113 sequences (Figure 4, top line) is dominated by the CCPFL family, as half of the sequences belong to this group. To reduce this bias, we used both the overall consensus and ‘consensus of consensus’ approach as in Glatigny and Scazzocchio (1995), indicated by shading in Figure 4.
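The two-level procedure can be sketched in code. The example below applies the 70% threshold and the upper-case/lower-case convention given in the legend to Figure 4 to each group, and then compares the group consensus strings position by position so that every group carries equal weight regardless of its size. It is a simplified illustration: similarity classes are omitted, gap handling is reduced to a single symbol, and the function names, the marker characters `#` and `+` (standing in for black and grey shading) and the toy alignments are all assumptions, not the authors' implementation.

```python
from collections import Counter


def group_consensus(aligned, threshold=0.7):
    """Consensus for one group of aligned sequences.

    Upper case: every sequence has the residue; lower case: more than
    `threshold` of the sequences have it; '-' otherwise (including gaps).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    consensus = []
    for column in zip(*aligned):
        residue, count = Counter(c.upper() for c in column).most_common(1)[0]
        fraction = count / len(column)
        if residue == "-":
            consensus.append("-")
        elif fraction == 1.0:
            consensus.append(residue)
        elif fraction > threshold:
            consensus.append(residue.lower())
        else:
            consensus.append("-")
    return "".join(consensus)


def consensus_of_consensus(group_consensi, majority=0.6):
    """Mark positions conserved across groups, one vote per group.

    '#': all groups share the residue; '+': more than `majority` of the
    groups share it (three out of five in the legend); ' ' otherwise.
    """
    marks = []
    for column in zip(*group_consensi):
        letters = [c.upper() for c in column if c.isalpha()]
        if not letters:
            marks.append(" ")
            continue
        count = Counter(letters).most_common(1)[0][1]
        fraction = count / len(column)
        if fraction == 1.0:
            marks.append("#")
        elif fraction > majority:
            marks.append("+")
        else:
            marks.append(" ")
    return "".join(marks)


# Toy groups of different sizes (not taken from the PFL alignment)
large_group = ["ACDG", "ACEG", "ACDG", "ACDG"]
small_group = ["ACKG", "GCRG"]
consensi = [group_consensus(large_group), group_consensus(small_group)]
print(consensi, repr(consensus_of_consensus(consensi)))
```

In the toy example the larger group would dominate a pooled consensus at the first and third positions, whereas the second-level comparison marks only the positions that are conserved in both groups.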



Fig. 4. Alignment of the consensus sequences of the groups described in Figure 3, including the ones left out from the grouping. The threshold for conservation was 70%. A capital letter indicates that all the sequences within a group contain the same residue at a certain position; a lower-case letter indicates that >70% of the sequences have the same residue. Similarity groups are marked according to the Blosum35 matrix: 2 = NQ, 3 = ST, 4 = KR, 5 = FYW, 6 = LIVM and 7 = DE. For similarity groups, the number is black if all sequences within a group contain a similar type of residue and grey if >70% do. Shading: black if all the consensus sequences contain a similar type of residue and grey if more than three out of five contain a similar type of residue. If the residue is not conserved, it is marked with a dash and if there is a gap in the consensus sequence alignment it is marked with a dot. The last sequence is E.coli PFL (EcPFL). The residues contributing to the monomer–monomer interface are underlined. The residues forming the binding sites for coenzyme A and pyruvate are marked with # and ¤, respectively. The top sequence is the overall consensus sequence from all the protein sequences shown in Figure 3. The numbering is according to E.coli PFL and the conserved pieces of sequence discussed in the text are marked with a line next to the sequence numbering.

 
Overall conservation patterns

The members of the BSS, GD and CPFL groups are fairly homologous to each other in the N-terminal region, but not particularly to the other sequences (this is the first 370 positions in the alignment, corresponding to 250 residues in EcPFL numbering, as is used below). It encompasses the N-terminal domain (1–175) that wraps around the PFL barrel in CCPFLs and makes most of the contacts between the monomers of the dimer, in addition to the first strand of the barrel. There are 43 positions that show conservation in this region and 26 of those are only conserved among BSS, GD and CPFL groups (Figure 4); in the N-terminal domain 19/28 positions are conserved only among these three groups. This similarity suggests, first, that BSS, GD and CPFLs all have a similar N-terminal domain and, second, that the N-terminal domain may be significantly different to that in CCPFLs. This could affect oligomerization, as we have shown here for PFL2. Supporting this notion, the first strand of the barrel following the N-terminal domain is highly conserved in CCPFLs, but not in the other sequences.

The first absolutely conserved residue, Ala254 (EcPFL numbering), is at the beginning of an α-helix preceding the second strand of the barrel. In other words, the variable N-terminal region in PFL-like enzymes covers residues 1–250, including the first α/β-motif of the barrel. The lack of N-terminal conservation also suggests that the activating enzyme interacts mainly with other areas of the PFL-like proteins.

The conserved amino acids occur primarily in regions (labelled A–P, Figures 4 and 5A), rather than being scattered through the sequence. As expected, the two active site loops, the cysteine loop G413–420 and the glycine loop O728–737, are conserved (see below). In addition, regions that buttress the active site are also conserved (F369–375, N704–710, M640–654 and P747–754). However, most (10/16) of the conserved regions appear not to be directly involved in catalysis. They are parts of helices (8/16: A252–261, B277–286, C303–311, E348–357, H437–449, J535–544, K564–577), part of a strand (D337–340) and two loops, L619–623 and I511–518. The former interacts with conserved helix regions H and J, while the latter is between the barrel and the two long helices that form the monomer–monomer interface in EcPFL. It may therefore be involved in allosteric signalling, although the allosteric behaviour of a trimer such as PFL2 is likely to be significantly different than that of a dimer.



Fig. 5. Conserved sequence pieces plotted on to the EcPFL structure, shown in stereo. (A) Overall view, with helices as spirals and strands as arrows. A translucent molecular surface area is also shown. Conserved sequences are coloured blue and the conserved finger loops are coloured red. CoA is shown as a green stick model on the surface of the protein. (B) Elements interacting with the glycine loop and possibly contributing to the activation reaction. Colouring scheme as in (A); conserved regions (M, P, N and F) around the red finger loops are coloured blue. A glycine- and proline-rich region that would contribute to the specificity of activation is shown and conserved G632 and P635 are coloured blue. The residues discussed in the text are shown and Cαs of the glycines are shown as spheres. Water molecules between helix P and the glycine loop are shown as blue spheres. (C) Interactions of the conserved asparagines 706 and 708 with the glycine loop. Figures were generated from coordinates stored at the Protein Data Bank with i.d. 1H16 (Becker and Kabsch, 2002) with Pymol (Delano, 2002).

 
Apart from the active-site region (discussed below), most of the residues conserved in all the analysed sequences (Figure 4) appear to have a role in hydrophobic packing (Figure 5A). This is unusual, as the active site is normally the most conserved part of a protein and typically includes hydrophilic/charged residues (Baykov et al., 1999), but here it appears that this is not the case. It may be that the determinants of the RNR fold are the most conserved. For instance, the absolutely conserved Pro369 is the last one of the Pro–Ser–Pro–Glu–Pro loop (EcPFL) that helps the chain to dive into the α/β-barrel after an α-helix to form a barrel strand.

Active site conservation

The only absolutely conserved residue in the active site apart from Arg731 and Gly734 in the glycine loop is Cys419. It is close to Gly734 and it clearly performs the radical chemistry in most, if not all, of these enzymes. There are, however, a few isolated highly conserved residues centred around the glycine loop. In the P region, conserved Arg753 is hydrogen bonded to the backbone of Val732, whereas the following residue, Thr754 (mostly conserved), is hydrogen-bonded to the absolutely conserved Asp640 in region M (Figures 3 and 5B). The rest of the contacts with the glycine loop on the helix-P side are surprisingly made via a layer of water molecules in the inactive PFL (Figure 5B). This unusual feature may make it easier for the helix to open and thus enable activation. Another element possibly affecting the activation process and its specificity is the 626–638 loop (Figure 5B) positioned next to the water layer and helix P. It contains three prolines and two glycines in EcPFL, including the highly conserved Gly632 and Pro635. This loop, helix P and Asp640 might together be responsible for the conformational change during the activation process. The surface and the specific interactions just beneath the surface would give a common yet specific basis for the activation of these radical enzymes.

The glycine loop is anchored in position on the ‘underneath’ (Figure 5B) by interactions with the residues of the barrel strands without any intervening water layer. Conserved strand N704–710 provides ionic interactions to guide the loop into the proper orientation once the radical has been formed (Figure 5B and C). The interactions are made by conserved Asn/Gln706 and Asn708 to Arg731. Asn708 Nγ hydrogen bonds to the backbone of Arg731, while Asn706 Oγ makes hydrogen bonds to two different guanidinium Ns (Figure 5C). Arg731 also hydrogen bonds to the side chains of Gln398 (conserved in CCPFLs), Asn370 (somewhat conserved) and to the backbone carbonyl of Ser396 (Figure 5B and C). Consequently, the side chains of Arg731 and Asn706 are precisely positioned by a dense network of hydrogen bonds. The whole arrangement appears to stabilize an unusual interaction between the Asn706 Nγ and the main-chain nitrogen of Gly734, which are within hydrogen bonding distance (2.9 Å; Figure 5C), but cannot be hydrogen-bonding partners. This region seems to be important, since both Asn706 and Asn708 are almost completely conserved.

In the binding pocket itself, the residues that contribute to the binding of pyruvate and CoA to inactive PFL (Figure 4) (Becker and Kabsch, 2002; Lehtiö et al., 2002) are only conserved within the CCPFL group, consistent with the wide variety of substrates and reactions found in PFL-like enzymes. The only pyruvate-binding residues that show some conservation between groups are aromatic residues at positions 327 and 333 that form hydrophobic interactions with the substrate. In EcPFL Trp333, the conserved hydrophobic Ile606 and a conserved aromatic Tyr735 form a hydrophobic barrier between the binding site of pyruvate and Gly734. Conversely, conserved F327 forms part of the hydrophobic substrate-binding pocket opposite Cys418.

Correlated active-site conservation

Cysteine 418 is required for the activity of PFL and has been proposed to be the thiyl radical that attacks pyruvate (Becker and Kabsch, 2002), but it is not conserved in PFL-like proteins. Residue 418 is, however, absolutely conserved within each group. In the CCPFL group, it is cysteine; in SHORT, it is serine; in CPFLs, glycine and in the BSSs, it is leucine or glycine. Another residue that is highly, but differently, conserved is 733. In CCPFL it is serine; in SHORT it is threonine; in GD and CPFL it is alanine and in BSS, it is either serine or alanine. This suggests that the enzymes in the BSS group actually perform different functions. According to the phylogenetic tree, this group could also have been divided into two subgroups (Figure 3). This differential conservation in two positions close to the active site also suggests that these residues have significance in each group and are somehow utilized by the catalytic machinery; one requires Leu–Ser whereas the other requires Gly–Ala.


    Discussion
 
In inactive PFL, Cys418 is bound to the substrate, pyruvate, and we prefer the view that this is the binding site that enables activation of the enzyme (Lehtiö et al., 2002), since pyruvate, or its inert analogue oxamate, is required for the activation reaction. This idea is further supported by the sequence alignment since Cys419, not Cys418, is conserved as part of the general radical machinery of these enzymes. This leaves positions 418 and 733 as the only active-site residues that clearly vary by family, as the others are either completely conserved (Cys419, Arg731, Gly734) or else vary within the families. Cys418 and Ser733 in EcPFL, however, lie far away from each other (Cα–Cα distance 10.7 Å) and hence are unlikely to contribute to the same binding site. One possibility is that position 418 varies between families because of substrate binding to the inactive enzyme and 733 because of the radical chemistry that the enzyme performs once activated.

Recently published crystal structures of the radical SAM enzymes biotin synthase and coproporphyrinogen III oxidase (Layer et al., 2003; Berkovitch et al., 2004) are consistent with our earlier proposal (Lehtiö et al., 2002) that the glycine loop will be pulled out of the core of PFL for activation. In biotin synthase and coproporphyrinogen III oxidase, the radical is transferred to a substrate positioned next to the cleaved SAM cofactor. Analogously, the glycine loop as a substrate would also bind close to that site. This is consistent with the fact that heptapeptides homologous to the glycine loop induce cleavage of SAM (Frey et al., 1994); a linear sequence, rather than structure, appears to be required for glycyl radical generation.

Since the conserved Arg731 is part of the PFL core and is also required for the radical generation in peptide studies (Frey et al., 1994), how does the reaction stay specific to PFL? We propose that the surface covering the glycine loop forms a molecular recognition surface that differs from PFL family to PFL family. Only the correct activating enzyme (AE) is able to bind to the surface and expose the glycine loop. This is consistent with the fact that the surface near the active site is rather conserved within CCPFLs, but not within different PFL-like enzymes. Consistent with this view, E.coli AE can activate Streptococcus bovis PFL (Asanuma and Hino, 2000), but (see above) PFL2 is not activated by E.coli AEs. This further supports the view that PFL2 is not a PFL (Figure 3). Indeed, the sequence identity between PFL2 and the recently solved GD (O'Brien et al., 2004) is 32%, whereas the sequence identity between PFL2 and the next most similar enzyme of known function (BSS) is only 24%. It is therefore tempting to reclassify PFL2 as a GD, but only 5/8 of the active-site residues that bind glycerol are conserved. Finally, O'Brien et al. (2004) describe a dimeric enzyme, whereas PFL2 is clearly trimeric.

An unusual feature of both anaerobic RNR and PFL-like enzymes is a conserved polar residue close to the backbone amide of the activated glycine. In the PFL-like enzymes the residue, EcPFL Asn706, is either Asn or Gln, whereas in RNRs it is Asp (Logan et al., 1999). The reason for this differential conservation might be in the differences in the activation reaction or in the catalytic reaction itself. Logan et al. (1999) suggested that ‘Glu446 may participate in fine-tuning the position of the glycyl radical relative to Cys290 in response to substrate binding, which would thus act as a trigger for Cys radical formation’. However, Asn706 cannot accept a hydrogen bond from the glycine amide. One possibility is that, after activation, Asn706 would provide additional stabilizing force to the glycyl radical if, for instance, the radical was deprotonated. Differences in the activation reaction are significant in RNRs and PFLs, since the activation of an anaerobic RNR might be mediated by an additional metal binding site located at the catalytic subunit (Logan et al., 2003), whereas this additional cluster does not exist in PFL.

All the known glycyl radical enzymes are dimeric with additional subunits, activating enzymes or both. PFL2 from A.fulgidus is the first one studied which breaks this rule by forming a homotrimer. This suggests at the very least a different activation stoichiometry. Based on our overexpression cultures, neither E.coli AE nor some other activating enzyme from E.coli can activate PFL2. So far we have been unable to overexpress the AfAE2 in soluble form and therefore we cannot yet confirm how or under what conditions PFL2 is activated. However, we have recently obtained crystals of PFL2, so the geometry of the enzyme, possible substrate and mechanisms of activation should become clear.


    Acknowledgments
 
The authors acknowledge Michael Merckel for help and discussions, Dr Ingemar von Ossowski for advice with cloning and Dr Roman Tuma for help with DLS. They also acknowledge Dr Nils-Kåre Birkeland for the A.fulgidus genome preparation. This work was supported by the National Graduate School in Informational and Structural Biology, by Academy of Finland grants 168155 and 172618 and by the Sigrid Juselius Foundation.


    References
 
Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402.

Asanuma,N. and Hino,T. (2000) Appl. Environ. Microbiol., 66, 3773–3777.

Baykov,A.A., Cooperman,B.S., Goldman,A. and Lahti,R. (1999) Prog. Mol. Subcell. Biol., 23, 127–150.

Becker,A. and Kabsch,W. (2002) J. Biol. Chem., 277, 40036–40042.

Becker,A., Fritz-Wolf,K., Kabsch,W., Knappe,J., Schultz,S. and Volker Wagner,A.F. (1999) Nat. Struct. Biol., 6, 969–975.

Berkovitch,F., Nicolet,Y., Wan,J.T., Jarrett,J.T. and Drennan,C.L. (2004) Science, 303, 76–79.

Bradford,M.M. (1976) Anal. Biochem., 72, 248–254.

Conradt,H., Hohmann-Berger,M., Hohmann,H.P., Blaschkowski,H.P. and Knappe,J. (1984) Arch. Biochem. Biophys., 228, 133–142.

Delano,W.L. (2002) The PyMOL Molecular Graphics System, San Carlos, CA. www.pymol.org.

Dobrindt,U., Blum-Oehler,G., Nagy,G., Schneider,G., Johann,A., Gottschalk,G. and Hacker,J. (2002) Infect. Immun., 70, 6365–6372.

Felsenstein,J. (1989) Cladistics, 5, 164–166.

Frey,M., Rothe,M., Wagner,A.F. and Knappe,J. (1994) J. Biol. Chem., 269, 12432–12437.

Glatigny,A. and Scazzocchio,C. (1995) J. Biol. Chem., 270, 3534–3550.

Hesslinger,C., Fairhurst,S.A. and Sawers,G. (1998) Mol. Microbiol., 27, 477–492.

Kane,S.R., Beller,H.R., Legler,T.C. and Anderson,R.T. (2002) Biodegradation, 13, 149–154.

Klenk,H.P. et al. (1997) Nature, 390, 364–370.

Knappe,J., Neugebauer,F.A., Blaschkowski,H.P. and Ganzler,M. (1984) Proc. Natl Acad. Sci. USA, 81, 1332–1335.

Knappe,J., Elbert,S., Frey,M. and Wagner,A.F. (1993) Biochem. Soc. Trans., 21, 731–734.

Layer,G., Moser,J., Heinz,D.W., Jahn,D. and Schubert,W.D. (2003) EMBO J., 22, 6214–6224.

Lehtiö,L., Leppänen,V.M., Kozarich,J.W. and Goldman,A. (2002) Acta Crystallogr. D, 58 (Pt 12), 2209–2212.

Leppänen,V.M., Merckel,M.C., Ollis,D.L., Wong,K.K., Kozarich,J.W. and Goldman,A. (1999) Struct. Fold. Des., 7, 733–744.

Leuthner,B., Leutwein,C., Schulz,H., Horth,P., Haehnel,W., Schiltz,E., Schagger,H. and Heider,J. (1998) Mol. Microbiol., 28, 615–628.

Logan,D.T., Andersson,J., Sjoberg,B.M. and Nordlund,P. (1999) Science, 283, 1499–1504.

Logan,D.T., Mulliez,E., Larsson,K.M., Bodevin,S., Atta,M., Garnaud,P.E., Sjoberg,B.M. and Fontecave,M. (2003) Proc. Natl Acad. Sci. USA, 100, 3826–3831.

Mulliez,E., Fontecave,M., Gaillard,J. and Reichard,P. (1993) J. Biol. Chem., 268, 2296–2299.

Nicholas,K.B. and Nicholas,N.B.,Jr (1997) Genedoc: a tool for editing and annotating multiple sequence alignments. Distributed by the authors.

O'Brien,J.R., Raynaud,C., Croux,C., Girbal,L., Soucaille,P. and Lanzilotta,W.N. (2004) Biochemistry, 43, 4635–4645.

Ollagnier,S., Mulliez,E., Gaillard,J., Eliasson,R., Fontecave,M. and Reichard,P. (1996) J. Biol. Chem., 271, 9410–9416.

Page,R.D. (1996) Comput. Appl. Biosci., 12, 357–358.

Raynaud,C., Sarcabal,P., Meynial-Salles,I., Croux,C. and Soucaille,P. (2003) Proc. Natl Acad. Sci. USA, 100, 5010–5015.

Salanoubat,M. et al. (2002) Nature, 415, 497–502.

Selmer,T. and Andrei,P.I. (2001) Eur. J. Biochem., 268, 1363–1372.

Sofia,H.J., Chen,G., Hetzler,B.G., Reyes-Spindola,J.F. and Miller,N.E. (2001) Nucleic Acids Res., 29, 1097–1106.

Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Nucleic Acids Res., 22, 4673–4680.

Unkrig,V., Neugebauer,F.A. and Knappe,J. (1989) Eur. J. Biochem., 184, 723–728.

Wagner,A.F., Frey,M., Neugebauer,F.A., Schafer,W. and Knappe,J. (1992) Proc. Natl Acad. Sci. USA, 89, 996–1000.

Wagner,A.F., Schultz,S., Bomke,J., Pils,T., Lehmann,W.D. and Knappe,J. (2001) Biochem. Biophys. Res. Commun., 285, 456–462.

Young,P., Andersson,J., Sahlin,M. and Sjoberg,B.M. (1996) J. Biol. Chem., 271, 20770–20775.

Received July 12, 2004; accepted July 13, 2004.

Edited by Joel Sussman