Reconstruction of the evolutionary history of the LexA-binding sequence
Gerard Mazón1,
Ivan Erill2,
Susana Campoy3,
Pilar Cortés1,
Evelyne Forano4 and
Jordi Barbé1,3
1 Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
2 Biomedical Applications Group, Centro Nacional de Microelectrónica, 08193 Bellaterra, Spain
3 Centre de Recerca en Sanitat Animal (CReSA), 08193 Bellaterra, Spain
4 Unité de Microbiologie, INRA, Centre de Recherches de Clermont-Ferrand-Theix, 63122 Saint-Genès-Champanelle, France
Correspondence
Jordi Barbé
jordi.barbe{at}uab.es
ABSTRACT
In recent years, the recognition sequence of the SOS repressor LexA protein has been identified for several bacterial clades, such as the Gram-positive, green non-sulfur bacteria and Cyanobacteria phyla, or the Alphaproteobacteria, Deltaproteobacteria and Gammaproteobacteria classes. Nevertheless, the evolutionary relationship among these sequences and the proteins that recognize them has not been analysed. Fibrobacter succinogenes is an anaerobic Gram-negative bacterium that branched from a common bacterial ancestor immediately before the Proteobacteria phylum. Taking advantage of its intermediate position in the phylogenetic tree, and in an effort to reconstruct the evolutionary history of LexA-binding sequences, the F. succinogenes lexA gene has been isolated and its product purified to identify its DNA recognition motif through electrophoretic mobility assays and footprinting experiments. After comparing the available LexA DNA-binding sequences with the F. succinogenes one, reported here, directed mutagenesis of the F. succinogenes LexA-binding sequence and phylogenetic analyses of LexA proteins have revealed the existence of two independent evolutionary lanes for the LexA recognition motif that emerged from the Gram-positive box: one generating the Cyanobacteria and Alphaproteobacteria LexA-binding sequences, and the other giving rise to the F. succinogenes and Myxococcus xanthus ones, in a transitional step towards the current Gammaproteobacteria LexA box. The contrast between the results reported here and the phylogenetic data available in the literature suggests that, some time after its emergence as a distinct bacterial class, the Alphaproteobacteria lost its vertically received lexA gene, but received later through lateral gene transfer a new lexA gene belonging to either a cyanobacterium or a bacterial species closely related to this phylum. 
This constitutes the first report based on experimental evidence of lateral gene transfer in the evolution of a gene governing such a complex regulatory network as the bacterial SOS system.
Abbreviations: EMSA, electrophoresis mobility-shift assay; GST, glutathione S-transferase; LGT, lateral gene transfer
INTRODUCTION
Preservation of genetic material is one of the most fundamental functions of any living being, and it is perhaps in the Domain Bacteria where this aspect has been most thoroughly studied. As in the case of many other biological processes, Escherichia coli has been the principal subject of this research, and many E. coli genes involved in preservation of genetic material have been identified through the years. Some of them encode proteins that are able to repair different types of DNA damage, whilst others aim at guaranteeing cell survival in the presence of such lesions. Many of these genes act in a coordinated manner, constituting specific DNA repair networks, and the broadest and the most thoroughly studied of these regulons is the LexA-mediated SOS response (Walker, 1984). In E. coli, the LexA protein controls the expression of some 40 genes (Fernández de Henestrosa et al., 2000; Courcelle et al., 2001), including both the lexA and recA genes, which are, respectively, the negative and positive regulators of the SOS response (Walker, 1984). The E. coli LexA protein specifically recognizes and binds to an imperfect 16 bp palindrome with consensus sequence CTGTN8ACAG, designated the E. coli SOS or LexA box (Walker, 1984). Both in vitro and in vivo experiments have shown that binding to single-stranded DNA fragments, generated by DNA damage-mediated inhibition of replication, activates the RecA protein (Sassanfar & Roberts, 1990). Once in its active state, RecA promotes the autocatalytic cleavage of LexA, resulting in the expression of the genes regulated by this repressor (Little, 1991). Hydrolysis of the E. coli LexA protein is mediated by its Ser119 and Lys156 residues, in a mechanism similar to that of proteolysis by serine proteases (Luo et al., 2001). After DNA repair, the RecA protein ceases to be activated and, consequently, non-cleaved LexA protein returns to its usual levels, repressing again the genes that are under its direct negative control.
Even though some notable exceptions have been reported, the increasing availability of microbial genome sequences has revealed that LexA is present in many bacterial species and in most phyla. So far, all the identified and characterized LexA proteins display two conserved domains that are clearly differentiated. The N-domain, ending at the Ala-Gly bond where the protein is cleaved after DNA damage activation of RecA (Little, 1991), has three α helices that are necessary for the recognition and binding of LexA to the SOS box (Fogh et al., 1994; Knegtel et al., 1995). Conversely, the C-domain contains amino acids that are essential for the serine-protease-mediated auto-cleavage and for the dimerization process necessary for repression (Luo et al., 2001).
The sequence of the LexA box is strongly conserved among related bacterial species. In fact, the LexA box has been shown to be monophyletic for several bacterial phyla, and this feature has been successfully exploited in phylogenetic analyses (Erill et al., 2003). Thus, in the Gram-positive phylum the LexA-binding motif presents a CGAACRNRYGTTYC consensus sequence (Winterling et al., 1998) that, with slight variations (Davis et al., 2002), is conserved among all its members and is also found in the phylogenetically close green non-sulfur bacteria that, nonetheless, are Gram-negative bacteria (Fernández de Henestrosa et al., 2002). Apart from the Gammaproteobacteria, in which the consensus sequence CTGTN8ACAG is monophyletic and seems to extend to those Betaproteobacteria that possess a lexA gene (Erill et al., 2003), alternative LexA-binding sequences with a high degree of conservation have also been described in other groups. For instance, the direct repeat GTTCN7GTTC is the LexA-binding sequence of the Alphaproteobacteria harbouring a lexA gene, a group that includes the Rhodobacter, Sinorhizobium, Agrobacterium, Caulobacter and Brucella genera (Fernández de Henestrosa et al., 1998; Tapias & Barbé, 1999). Still, in other phyla where the LexA-binding motif has been identified, more data are required to gauge the conservation of the LexA box. Such is the case of the Deltaproteobacteria, for which a CTRHAMRYBYGTTCAGS consensus motif has been identified in one of its members, the fruiting body forming Myxococcus xanthus (Campoy et al., 2003).
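The consensus sequences quoted above use IUPAC ambiguity codes (R, Y, N and so on) and a run-length shorthand (N8, N7). As a minimal sketch of how such a consensus can be read, the snippet below expands the shorthand and turns the consensus into a regular expression. The function names and the example sites are hypothetical and chosen only for illustration; they are not sequences reported in this work.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def expand_shorthand(consensus):
    """Expand run-length shorthand such as 'N8' into 'NNNNNNNN'."""
    return re.sub(r"([A-Z])(\d+)",
                  lambda m: m.group(1) * int(m.group(2)),
                  consensus.upper())


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regular expression."""
    expanded = expand_shorthand(consensus)
    return re.compile("".join(IUPAC[base] for base in expanded))


def matches(consensus, site):
    """True if the whole site is compatible with the consensus."""
    return consensus_to_regex(consensus).fullmatch(site.upper()) is not None


# Consensus sequences as quoted in the text; the sites are made-up examples.
print(expand_shorthand("CTGTN8ACAG"))
print(matches("CTGTN8ACAG", "CTGTATATATATACAG"))
print(matches("GTTCN7GTTC", "GTTCAAAAAAAGTTC"))
print(matches("CGAACRNRYGTTYC", "CGAACATATGTTCC"))
```

An exact-match regular expression is the strictest reading of a consensus; real binding sites usually tolerate some mismatches, so this is best seen as a notation aid rather than a site predictor.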
The existence of different LexA recognition motifs and the monophyletic or paraphyletic nature of those studied so far indicate that the appearance of new LexA-binding motifs marks turning points in the evolutionary history of both this protein and its respective host species. Previous work has demonstrated that the cyanobacteria LexA box (RGTACNNNDGTWCB) derives directly from that of Gram-positive bacteria (Mazón et al., 2004). Nevertheless, a huge gap is still apparent in the further evolutionary pathway of the LexA box that leads from the Cyanobacteria up to other bacterial phyla of later appearance, such as the Proteobacteria. Protein signature analyses have established that Fibrobacter succinogenes branched from a common bacterial ancestor immediately before the Proteobacteria phylum (Griffiths & Gupta, 2001). F. succinogenes is an anaerobic Gram-negative bacterium that inhabits the rumen and caecum of herbivores and, for a long time, this organism was included in the Bacteroides genus. Recent 16S rRNA analyses, however, have granted Fibrobacter a new bacterial phylum of its own (Maidak et al., 1999; Ludwig & Schleifer, 1999).
In an effort to recreate the evolutionary history of the LexA protein through the changes in its recognition sequence, and taking advantage of the fact that the F. succinogenes genome is now partially sequenced, the lexA gene of this bacterial species has been isolated and its encoded product has been purified to determine its DNA recognition sequence. The results obtained here are in accordance with the newly established branching point of F. succinogenes, and introduce a novel element that allows a finer drawing of the evolutionary path of the LexA recognition sequence from Gram-positive bacteria to Gammaproteobacteria.
METHODS
Bacterial strains, plasmids, oligonucleotides and DNA techniques.
Bacterial strains and plasmids used in this work are listed in Table 1. E. coli and F. succinogenes ATCC 19169 strains were grown in LB (Sambrook et al., 1992) and in a chemically defined medium (Gaudet et al., 1992) with 3 g cellobiose l⁻¹, respectively. Antibiotics were added to the cultures at reported concentrations (Sambrook et al., 1992). E. coli cells were transformed with plasmid DNA as described by Sambrook et al. (1992). All restriction enzymes, PCR-oligonucleotide primers, T4 DNA ligase and polymerase, and the DIG-DNA labelling and detection kit were from Roche. DNA from F. succinogenes cells was extracted as described by Forano et al. (1994).
The synthetic oligonucleotide primers used for PCR amplification are listed in Table 2. To facilitate subcloning of some PCR DNA fragments, specific restriction sites were incorporated into the oligonucleotide primers. These restriction sites are identified in Table 2. Mutants in the F. succinogenes lexA promoter were obtained by PCR mutagenesis, using oligonucleotides carrying designed substitutions (Table 2). The DNA sequence of all PCR-mutagenized fragments was determined by the dideoxy method (Sanger et al., 1977) on an ALF Sequencer (Amersham Pharmacia). In all cases the entire nucleotide sequence was determined for both DNA strands.
Table 2. Oligonucleotide primers used in this work
In the sequence column, added restriction sites, when present, are shown in italics, and introduced nucleotide changes are shown in lower case and underlined. The position column refers to the position of the 5' end of the oligonucleotide with respect to the proposed translational starting point of each F. succinogenes gene. EMSA, electrophoretic mobility shift assay.
Molecular cloning of the F. succinogenes lexA gene and purification of its encoded protein.
The F. succinogenes lexA gene was amplified from the total DNA of the F. succinogenes ATCC 19169 strain using the LexAup and LexAdw oligonucleotide primers (Table 2), corresponding to nucleotides −276 to −249 and +653 to +678, with respect to its proposed translational starting point. The 954 bp PCR fragment obtained was cloned into the pGEM-T vector (Promega), obtaining the pUA1033 plasmid. To confirm that no mutation was introduced during the amplification reaction, the sequence of the fragment was determined. The plasmid pUA1038 was constructed in order to create and express a glutathione S-transferase (GST)–F. succinogenes LexA fusion protein. The first step in the construction of this plasmid was to amplify the F. succinogenes lexA gene from plasmid pUA1033, using the primers LexAEcoRI and LexADw. The resulting DNA fragment was cloned into pGEM-T, to give pUA1037. Following excision with EcoRI and SalI, the lexA gene was inserted into the pGEX4T1 expression vector (Amersham Pharmacia) immediately downstream of the GST-encoding gene that is under the T7 promoter control. The initiation codon of the LexA protein was placed immediately downstream of the EcoRI sites in the LexAEcoRI primer, such that the lexA gene could be fused to GST in-frame. The insert of pUA1037 was sequenced in order to ensure that no mutations were introduced during amplification.
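The fragment sizes quoted in this work can be cross-checked against the primer coordinates. The sketch below assumes the usual convention in which +1 is the first base of the translational start, upstream positions are negative, and there is no position 0; this convention is an assumption rather than something stated explicitly in the text, and the function name is hypothetical. Reading the upstream primer coordinate as −276, it reproduces the 954 bp reported for the lexA amplicon.

```python
def span_length(start, end):
    """Length in bp of a fragment spanning `start`..`end` inclusive.

    Assumed convention: +1 is the first base of the start codon, -1 is the
    base immediately upstream, and position 0 does not exist.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this convention")
    if start > end:
        raise ValueError("start must not be downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # correct for the missing position 0
    return length


# lexA amplicon: 5' end of LexAup (-276) to 3' end of LexAdw (+678)
print(span_length(-276, 678))
# footprinting fragment described in the Results (-154 to +6)
print(span_length(-154, 6))
```

Under the same assumption the footprinting fragment comes out at the 160 bp stated in the Results, which suggests the convention is at least consistent with the reported sizes.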
To overproduce the LexA–GST fusion protein, the pUA1037 plasmid was transformed into the E. coli BL21(λDE3) codon-plus strain (Stratagene). Cells of the resulting BL21 codon-plus strain were diluted in 0·5 l LB medium and incubated at 37 °C until they reached an OD600 of 0·8. Fusion-protein expression was induced at this time by the addition of IPTG to a final concentration of 1 mM. Following incubation for an additional 3 h at 37 °C, cells were collected by centrifugation for 15 min at 3000 g. The bacterial pellet was resuspended in PBS buffer (10 mM Na2HPO4, 1·7 mM KH2PO4, 140 mM NaCl, 2·7 mM KCl, pH 7·4) containing Complete Mini protease inhibitor cocktail (Roche). The resulting cell suspensions were lysed by sonication. Unbroken cells and debris were removed by centrifugation for 20 min at 14 000 g. The supernatant containing the GST–LexA fusion protein was incubated with PBS/Glutathione Sepharose 4B beads (Amersham Pharmacia) for 2 h at 4 °C, in order to affinity purify the fusion protein. The beads were then washed twice with PBS containing 0·1 % Triton and three times with PBS without detergent.
The sequence Leu-Val-Pro-Arg-Gly-Ser is located immediately downstream of the GST coding sequence in the pGEX4T vector series, and serves as a linker between the LexA and GST moieties of the fusion proteins. This hexapeptide is recognized by the protease thrombin, which cleaves at the Arg-Gly bond. It was therefore possible to release the F. succinogenes LexA protein from the Sepharose beads by incubating a 700 µl bed volume of beads with 25 units of thrombin (Amersham Pharmacia) in 1 ml PBS. The supernatant containing the F. succinogenes LexA protein, which carries an additional 5 aa tail at its N terminus (Gly-Ser-Pro-Glu-Phe), was visualized in a Coomassie blue-stained 13 % SDS-PAGE gel (Laemmli, 1970). The purity of the protein was greater than 98 % (data not shown).
LexA proteins from Bacillus subtilis, E. coli, Anabaena PCC 7120, M. xanthus and Rhodobacter sphaeroides also used in this work had been previously purified (Winterling et al., 1998; Tapias et al., 2002; Campoy et al., 2003; Mazón et al., 2004).
Mobility-shift assays and DNase I footprinting.
LexA–DNA complexes were detected by electrophoresis mobility-shift assays (EMSAs) using purified LexA proteins. DNA probes were prepared by PCR amplification using one of the primers labelled at its 5' end with DIG (Roche) (Table 2), purifying each product in a 2–3 % low-melting-point agarose gel depending on DNA size. DNA–protein reactions (20 µl), typically containing 10 ng DIG-labelled DNA probe and 40 nM of the desired purified LexA protein, were incubated in binding buffer: 10 mM HEPES/NaOH (pH 8), 10 mM Tris/HCl (pH 8), 5 % glycerol, 50 mM KCl, 1 mM EDTA, 1 mM DTT, 2 µg poly(dG-dC) and 50 µg BSA ml⁻¹. After 30 min at 30 °C, the mixture was loaded onto a 5 % non-denaturing Tris/glycine polyacrylamide gel (pre-run for 30 min at 10 V cm⁻¹ in 25 mM Tris/HCl, pH 8·5, 250 mM glycine, 1 mM EDTA). DNA–protein complexes were separated at 150 V for 1 h, followed by transfer to a Biodyne B nylon membrane (Pall Gelman Laboratory). DIG-labelled DNA–protein complexes were detected by following the manufacturer's protocol (Roche). For the binding-competition experiments, a 300-fold molar excess of either specific or non-specific unlabelled competitor DNA was also included in the mixture. Protein concentrations were determined as described by Bradford (1976). All EMSAs were repeated a minimum of three times to ensure reproducibility of the results. DNase I footprinting assays were performed using the ALF Sequencer (Amersham Biosciences) as described previously (Patzer & Hantke, 2001; Campoy et al., 2003).
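The molar quantities implied by the binding reactions can be illustrated with a short calculation. The sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA (a common approximation) and a 323 bp probe, the length that the lexA promoter fragment used in the Results would have if numbered without a position 0. Both values, and the function names, are assumptions made for illustration only.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (assumed approximation)


def protein_pmol(conc_nM, volume_ul):
    """pmol of protein in a reaction: nM x µl gives fmol, so divide by 1000."""
    return conc_nM * volume_ul / 1000.0


def dsdna_pmol(mass_ng, length_bp):
    """pmol of a dsDNA fragment of the given mass (ng) and length (bp)."""
    return mass_ng / (length_bp * AVG_BP_MASS) * 1000.0


lexa = protein_pmol(40, 20)           # 40 nM LexA in a 20 µl reaction
probe = dsdna_pmol(10, 323)           # 10 ng of an assumed 323 bp probe
competitor = 300 * probe              # 300-fold molar excess of competitor

print(f"LexA: {lexa:.2f} pmol")
print(f"probe: {probe * 1000:.1f} fmol")
print(f"competitor: {competitor:.1f} pmol")
print(f"LexA/probe molar ratio: {lexa / probe:.1f}")
```

For a competitor of the same length as the probe, a 300-fold molar excess corresponds to 300 times the probe mass, i.e. about 3 µg per reaction under these assumptions.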
In silico phylogenetic analysis.
Preliminary sequence data for the unfinished F. succinogenes genome were obtained from The Institute for Genomic Research (TIGR) through their website at http://www.tigr.org, and protein sequences for all other organisms were obtained from the Microbial Genome Database for Comparative Analysis website (http://mbgd.genome.ad.jp/) and the TIGR Comprehensive Microbial Resource (CMR). Identification of additional LexA-binding genes was carried out using the RCGScanner software (Erill et al., 2003), using known E. coli LexA-governed genes (Fernández de Henestrosa et al., 2000; Erill et al., 2003) and the LexA box of F. succinogenes reported here to scan for putative LexA-binding sites across the F. succinogenes genome and then filter them through the consensus method.
For phylogenetic analyses, protein sequences for each gene under study were aligned using the CLUSTALW program (Higgins et al., 1994). Multiple alignments were then used to infer phylogenetic trees with the SEQBOOT, PROML and CONSENSE programs of the PHYLIP 3.6 software package (Felsenstein, 1989), applying the maximum-likelihood method on 100 bootstrap replicates. The resulting phylogenetic trees were plotted using TreeView (Page, 1996).
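The bootstrap step of this pipeline resamples the columns of the multiple alignment with replacement, so that each replicate has the same length as the original alignment but a different mix of sites. The sketch below illustrates that idea only; it is not the SEQBOOT implementation, and the toy alignment, taxon names and function names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one replicate built by sampling columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    if any(len(seq) != n_cols for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Generate `n_replicates` resampled alignments (100 were used here)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy protein alignment with made-up sequences (gaps shown as '-').
toy = {
    "taxon_A": "MKALTARQ-E",
    "taxon_B": "MKPLTARQQE",
    "taxon_C": "MRALSPRQ-D",
}
replicates = bootstrap_replicates(toy, n_replicates=100, seed=1)
print(len(replicates), len(replicates[0]["taxon_A"]))
```

Each replicate would then be passed to the tree-building step, and the fraction of replicate trees that recover a given branch gives the bootstrap value shown at that node.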
RESULTS
Determination of the F. succinogenes LexA recognition DNA sequence
EMSAs with the purified F. succinogenes LexA protein were carried out to determine the ability of this protein to bind to its own promoter. As can be seen in Fig. 1(a), the addition of increasing concentrations of LexA to a fragment extending from −154 to +169 of the F. succinogenes lexA gene promoter (with respect to its proposed translational starting point) produces one retardation band whose intensity is directly related to the amount of protein used. The formation of this DNA–LexA complex is specific, since it is sensitive to competition by an excess of unlabelled lexA promoter, but not to competition by non-specific DNA (Fig. 1b). Moreover, EMSAs performed using different-sized fragments containing the lexA promoter as a probe demonstrated that the LexA recognition sequence must lie between positions −72 and −57 of this promoter (data not shown).

Fig. 1. (a) Electrophoretic mobility of the DNA fragment containing the F. succinogenes lexA promoter in the presence of increasing concentrations of purified F. succinogenes LexA protein. (b) Effect of 300-fold molar excess of unlabelled F. succinogenes lexA promoter (lane 3) and pBSK(+) plasmid DNA (lane 4) on the migration of the F. succinogenes lexA promoter in the presence of its purified LexA protein (at 40 nM). The migration of the same fragment without any additional DNA (+) is also shown (lane 2). In both panels, the mobility of the F. succinogenes lexA promoter in the absence (−) of purified LexA protein is also presented as a negative control (lane 1).
To precisely identify the F. succinogenes LexA box, additional footprinting experiments with a 160 bp fragment extending from positions −154 to +6 were performed. The results obtained show that a 37 bp core region was protected by the LexA protein when both lexA-coding and non-coding strands were analysed (Fig. 2). A visual inspection of this DNA sequence revealed the presence of the imperfect palindrome TGCCCAGTTGTGCA in its central region. To determine whether this motif was really involved in LexA binding, the effect of single substitutions in each nucleotide of this palindrome on the formation of the LexA protein–lexA promoter complex was analysed. Results (Fig. 3) indicate that a single substitution in any position of the TGC trinucleotide, as well as in the last C of the TGCCC motif, abolishes LexA binding. Likewise, mutagenesis of any position of the GTGCA motif also inhibits DNA–LexA complex formation. In contrast, the substitution of single nucleotides immediately surrounding either the TGCCC or the GTGCAT motifs does not affect LexA binding. Taken together, these results show that the TGCNCNNNNGTGCA imperfect palindrome is the F. succinogenes LexA-binding sequence of the lexA gene, since it is required for the binding of the LexA protein to the lexA promoter. Additional single substitutions generating two different perfect palindromes were carried out. The binding ratio of the LexA protein to both the TGCAC-N4-GTGCA palindrome and the wild-type sequence was practically the same. In contrast, there was a substantial reduction (about 75 %) in the LexA-binding ratio to the TGCCC-N4-GGGCA palindrome. This demonstrates that a TGCAC-N4-GTGCA perfect palindrome is the most likely consensus for the LexA box of F. succinogenes.
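The distinction between the imperfect wild-type palindrome and the two perfect palindromes can be made explicit by comparing the left half-site with the reverse complement of the right half-site. The following is a minimal sketch of that comparison; the function names are hypothetical, and the two mutant sequences are written with the wild-type AGTT spacer, which is an assumption, since the text gives the spacer only as N4.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def half_site_mismatches(site, half_len=5):
    """Mismatches between the left half-site and the reverse complement of
    the right half-site; 0 means the half-sites form a perfect palindrome."""
    site = site.upper()
    left = site[:half_len]
    right_rc = reverse_complement(site[-half_len:])
    return sum(a != b for a, b in zip(left, right_rc))


sites = {
    "wild type (TGCCC-N4-GTGCA)": "TGCCCAGTTGTGCA",
    "TGCAC-N4-GTGCA": "TGCACAGTTGTGCA",   # spacer assumed unchanged
    "TGCCC-N4-GGGCA": "TGCCCAGTTGGGCA",   # spacer assumed unchanged
}
for name, seq in sites.items():
    print(name, half_site_mismatches(seq))
```

The wild-type box differs from a perfect palindrome at a single half-site position, and either of the two single substitutions described above removes that mismatch. Symmetry alone does not predict binding, as shown by the reduced binding to the TGCCC-N4-GGGCA palindrome.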

Fig. 2. DNase I footprinting assays with coding and non-coding Cy5-labelled strands of the DNA fragment containing the F. succinogenes lexA promoter in the absence or presence of increasing amounts of purified LexA protein from the same organism. The translational starting codon is shown in bold and underlined.

Fig. 3. Single-nucleotide substitutions in the TTGCCCAGTTGTGCAT imperfect palindrome and their effect on the electrophoretic mobility of the F. succinogenes lexA promoter in the presence of purified F. succinogenes LexA protein (at 40 nM). The mobility of the wild-type F. succinogenes lexA promoter in the absence (−) or presence (+) of LexA from this same organism is also shown.
Identification of additional LexA-binding F. succinogenes genes
The characteristic amino acid residues of LexA proteins (an Ala-Gly bond separated about 34 positions from a Ser residue that is 37 positions away from a Lys residue) are also present in at least two other prokaryotic protein families: UmuD (a component of DNA polymerase V, which is involved in error-prone DNA repair) and lytic cycle prophage repressors (such as the λ cI protein) (Little, 1984; Burckhardt et al., 1988; Nohmi et al., 1988). Nevertheless, of these two families only the prophage repressors are able to bind specific DNA sequences. To eliminate the possibility that the F. succinogenes LexA was in fact a residual prophage repressor, a phylogenetic analysis of LexA proteins from several bacterial groups was performed. The results obtained (Fig. 4), together with phylogenetic trees including the λ cI repressor as an outgroup, indicate that the F. succinogenes LexA protein identified here is most probably a descendant of a Gram-positive LexA protein, and rule out the possibility of lateral gene transfer (LGT) from such an unspecified source as a residual prophage. To further validate this hypothesis, an in silico analysis of the F. succinogenes genome sequence was carried out using the RCGScanner program (Erill et al., 2003) in search of other genes with significant TGCNCNNNNGTGCA-like palindrome motifs upstream of their coding regions. Imperfect palindrome motifs were found upstream of the recA (TGCACAAAAGTTCA), uvrA (TATTCAAATGTTCA), ssb (TGCCTCCTCGAGCA) and ruvAB (AGCTCAAAGGCGCA) genes, and competitive EMSA experiments demonstrated that their promoters also bind F. succinogenes LexA (Fig. 5). Since these genes are under the control of the LexA protein in many bacterial species, the possibility that the F. succinogenes lexA gene identified here was the result of convergent evolution from a residual prophage repressor was definitively eliminated.
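How closely each of these upstream motifs follows the TGCNCNNNNGTGCA sequence can be quantified by counting mismatches at the non-N positions. The sketch below does this for the four reported sites and adds a simple single-strand sliding-window scan. It is an illustration only, not the RCGScanner algorithm; the function names, the mismatch threshold and the flanking sequence in the scan example are hypothetical.

```python
CONSENSUS = "TGCNCNNNNGTGCA"

# Upstream motifs as reported in the text.
SITES = {
    "recA": "TGCACAAAAGTTCA",
    "uvrA": "TATTCAAATGTTCA",
    "ssb": "TGCCTCCTCGAGCA",
    "ruvAB": "AGCTCAAAGGCGCA",
}


def mismatches(consensus, site):
    """Count positions where the site departs from a non-N consensus base."""
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return sum(c != "N" and c != s for c, s in zip(consensus, site.upper()))


def scan(sequence, consensus=CONSENSUS, max_mismatches=2):
    """Slide the consensus along one strand; return (offset, window, count)."""
    seq = sequence.upper()
    width = len(consensus)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        count = mismatches(consensus, window)
        if count <= max_mismatches:
            hits.append((i, window, count))
    return hits


for gene, site in SITES.items():
    print(gene, mismatches(CONSENSUS, site))

# Made-up flanking sequence around the reported recA motif.
print(scan("AAAA" + SITES["recA"] + "AAAA", max_mismatches=1))
```

By this count the recA motif departs from the lexA-derived sequence at one position and the uvrA motif at three, which is consistent with describing them as imperfect palindromes. A genome-wide search would also need to scan the complementary strand and restrict hits to promoter regions.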

Fig. 4. Phylogenetic tree of the LexA protein sequence. Gram-positive: B. subtilis (Bsu), Clostridium perfringens (Cpe), Mycobacterium tuberculosis (Mtu), Staphylococcus aureus (Sav), Streptomyces coelicolor (Sco); Cyanobacteria: Anabaena (Ana), Prochlorococcus marinus (Pmi), Synechocystis (Syn); F. succinogenes (Fbs); Delta Proteobacteria: M. xanthus (Myx); Alpha Proteobacteria: A. tumefaciens (Atc), Bradyrhizobium japonicum (Bja), Brucella melitensis (Bme), Brucella suis (Brs), Caulobacter crescentus (Ccr), Mesorhizobium loti (Mlo), Rhodopseudomonas palustris (Rpa), Sinorhizobium meliloti (Sme); Beta Proteobacteria: Bordetella pertussis (Bpe), Ralstonia solanacearum (Rso); Gamma Proteobacteria: E. coli (Eco), Haemophilus influenzae (Hin), Shewanella oneidensis (Son), Vibrio cholerae (Vch), Vibrio parahaemolyticus (Vpa), Vibrio vulnificus (Vvu), Yersinia pestis (Ype). Numbers at branch nodes indicate bootstrapping values for 100 bootstrap replicates. Bar, ** substitutions.

Fig. 5. Electrophoretic mobility of the wild-type F. succinogenes lexA promoter in the presence of its LexA protein (40 nM) and a 300-fold molar excess of unlabelled fragments containing about 400 bp of the upstream region of the F. succinogenes recA, uvrA, ruvAB and ssb genes. As a control, the effect on lexA promoter mobility of the addition of unlabelled lexA and trp promoters is displayed in the presence of the same amount of LexA protein. The mobility of the lexA promoter in the absence of any additional DNA (+) or purified LexA protein (−) is also shown.
Comparative analysis of the F. succinogenes LexA protein and its recognition sequence
The phylogenetic tree shown in Fig. 4 was constructed from a multiple alignment of available LexA protein sequences from relevant members of the Gram-positive and Cyanobacteria phyla and the Alphaproteobacteria, Betaproteobacteria and Gammaproteobacteria classes, and those of both F. succinogenes and M. xanthus. As expected, the resulting tree reveals that all these LexA proteins derive from the LexA of Gram-positive bacteria. However, closer examination of the phylogenetic tree indicates that at least two divergent paths originated from the Gram-positive LexA protein: one leading to the Deltaproteobacteria, Betaproteobacteria and Gammaproteobacteria LexA, with the F. succinogenes LexA as an intermediate step, and the other giving rise to the Cyanobacteria LexA and, unexpectedly, the Alphaproteobacteria LexA.
To analyse whether the relationships between the different LexA proteins displayed in the phylogenetic tree were also reflected in their respective binding sites, a sequence comparison between the aforementioned LexA-binding sequences and that of F. succinogenes was carried out. This comparison reveals the presence of marked resemblances among several nucleotide positions (Fig. 6) that are consistent with a common phylogenetic origin. Moreover, and in accordance with the dual-branching hypothesis prompted by LexA protein phylogeny, on close inspection these resemblances again suggest two putative evolutionary lanes emerging from the Gram-positive LexA box: one giving rise to the Cyanobacteria and Alphaproteobacteria LexA box and the other leading to both the F. succinogenes and M. xanthus LexA boxes and, ultimately, resulting in the Betaproteobacteria and Gammaproteobacteria LexA box.

Fig. 6. Schematic diagram representing the similarities between LexA recognition sites of different bacterial clades and the possible generation of several LexA boxes following the two apparent evolutionary lanes that emerge from Gram-positive bacteria. Bases belonging to the palindromic motif of the Gram-positive LexA box that are conserved through the evolutionary history of the LexA recognition sequence are shaded. Changes to the LexA-binding sequence are highlighted in bold at the step in which they were introduced.
Genesis of different LexA boxes through directed mutagenesis of the F. succinogenes LexA-binding sequence
To further confirm the putative relationship between the LexA proteins described above, the vertical evolutionary path leading from Gram-positive bacteria to Gammaproteobacteria was experimentally analysed, taking the F. succinogenes LexA recognition sequence as a starting point to generate, through directed mutagenesis, the LexA-binding sequences of Gram-positive bacteria, Myxococcus, and Betaproteobacteria and Gammaproteobacteria. As can be seen (Fig. 7a), the B. subtilis LexA protein is able to bind the F. succinogenes LexA box with the introduction of only five substitutions (the T as well as the two internal Cs of the TGCCC motif, plus the internal G and A of GTGCAT), a number that, considering the evolutionary distance between the species, is remarkably low. Similarly, the M. xanthus LexA protein is able to bind a F. succinogenes lexA-derivative promoter in which only the flanking bases at each end of the TGCCCAGTTGTGCA palindrome have been substituted for a C and a G, respectively, and the G of the internal GTGCA motif has been replaced by a T. Finally, the E. coli LexA protein can effectively bind to the lexA promoter recognized by M. xanthus LexA if only three additional changes to the mutant promoter are made: substitution of the CC duet for TA on the TGCC tetranucleotide and a change from T to A in the TTC trinucleotide. The fact that both these generated motifs are very close to experimentally validated LexA-binding motifs of B. subtilis and E. coli (Fig. 7a) indicates that a mutational transition similar to the one proposed here could certainly have taken place between the LexA-binding sequences of these species.

Fig. 7. (a) Binding ability of B. subtilis, F. succinogenes, M. xanthus and E. coli LexA proteins to the F. succinogenes lexA wild-type promoter (Fbs Wt) and several mutant derivatives. (b) Binding ability of Anabaena and R. sphaeroides LexA proteins to the Anabaena lexA wild-type promoter (Ana Wt) and several mutant derivatives. All changes were introduced through directed mutagenesis according to the comparative schematic diagram of LexA boxes shown in Fig. 6. In all cases, −, ± or + denote, respectively: no LexA binding, LexA binding with a percentage of bound probe lower than 25 %, and LexA binding with a percentage of bound probe higher than 25 %. Bases of F. succinogenes and Anabaena LexA boxes that are required for binding of their own LexA protein are overlined in each panel. For each mutagenesis step, the bases either modified or added are shown in bold, and the change is indicated with an arrow. Likewise, changes introduced in a previous step remain underlined in subsequent steps. Experimentally confirmed LexA-binding sequences of B. subtilis (Bsu), E. coli (Eco) and R. sphaeroides (Rps) are shown for comparison.
Derivation of the Alphaproteobacteria LexA-binding sequence from the cyanobacterial LexA box
To complete the above-described analysis of the evolutionary relationship of LexA proteins through their binding sequences, a similar study was conducted to check the feasibility of the remaining branching line from Gram-positive bacteria (i.e. the one giving rise to Cyanobacteria and Alphaproteobacteria LexA proteins). In concordance with the hypothesis presented in Fig. 6, it was found that the simple addition of three nucleotides (chosen in accordance with the Alphaproteobacteria LexA-box consensus sequence) between the AGTAC and GTTC motifs of the cyanobacterial LexA box was sufficient to enable the binding of the R. sphaeroides LexA protein to the mutant LexA box in the Anabaena lexA gene promoter (Fig. 7b). Furthermore, and although significant binding of the R. sphaeroides LexA protein to the Anabaena lexA promoter could be easily accomplished with the single insertion event described above, the introduction of an additional single-point mutation (substitution of T for A in the GTAC tetranucleotide) to the mutant Anabaena lexA promoter dramatically increased the recognition ability of the R. sphaeroides LexA repressor (Fig. 7b). Again, the fact that experimentally confirmed LexA-binding motifs closely resembling the motifs generated here are present in R. sphaeroides (Fig. 7b) gives further support to the evolutionary pathway here proposed.
 |
DISCUSSION
|
---|
In this work we have demonstrated that, through a programmed set of nucleotide changes, both the Gram-positive and E. coli-like LexA boxes can be obtained from the F. succinogenes LexA-binding sequence. Furthermore, our results indicate that the G and C corresponding to the most external positions of the GAACN4GTTC motif recognized by the Gram-positive LexA repressor are enclosed in the CTGT and ACAG sequences, respectively, found in the E. coli-like LexA box. In this way, the origin of the E. coli LexA recognition sequence (16 nucleotides) could be explained by a 2 bp size increase of the Gram-positive LexA-binding sequence (12 nucleotides) at each end. Nevertheless, this extension of the LexA recognition motif does not seem to have led to a significant increase in the size of the N-terminal domain region of the LexA protein that contains the three α-helices involved in DNA binding (Fig. 8). A straight comparison of the N-terminal domain of F. succinogenes and M. xanthus LexA protein sequences with the consensus sequences of this region for Gram-positive, Cyanobacteria and Alphaproteobacteria LexA proteins reveals no amino acid insertions in the residues that, in E. coli, have been shown to participate directly in DNA-binding activity, nor in their immediate neighbours (Fig. 8).
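The size relationship between the two motifs can be checked with a few lines of Python. This is a minimal sketch, not part of the original analysis: it expands the shorthand consensus strings quoted above, centres the Gram-positive motif within the E. coli-like one, and tests only the two outermost positions discussed in the text. The helper name `expand` is hypothetical.

```python
import re


def expand(consensus: str) -> str:
    """Expand shorthand such as 'GAACN4GTTC' into 'GAACNNNNGTTC'."""
    return re.sub(r"N(\d+)", lambda m: "N" * int(m.group(1)), consensus)


gram_pos = expand("GAACN4GTTC")   # Gram-positive LexA box
ecoli = expand("CTGTN8ACAG")      # E. coli-like LexA box

# Number of extra positions at each end of the longer motif.
flank = (len(ecoli) - len(gram_pos)) // 2

# Centre the shorter motif under the longer one, padding with '-'.
padded = "-" * flank + gram_pos + "-" * flank

print(len(gram_pos), len(ecoli), flank)  # 12 16 2
print(ecoli)                             # CTGTNNNNNNNNACAG
print(padded)                            # --GAACNNNNGTTC--

# The outermost G and C of the Gram-positive motif coincide with the
# G of CTGT and the C of ACAG, respectively.
print(ecoli[flank] == gram_pos[0], ecoli[-flank - 1] == gram_pos[-1])  # True True
```

Only the outermost G and C are compared here; the inner positions of the two consensus sequences differ and are not expected to match.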
Moreover, this comparative analysis of LexA protein sequences shows several fully conserved residues amongst those that constitute the three predicted α-helices that are involved in DNA binding. This suggests that, since their respective LexA boxes are markedly different, these amino acids must be required for the maintenance of the overall DNA recognition complex instead of being used for specific binding. This is the case for T5, Q8, E10, P26, S39, L50, G54 and R64, following the numbering of the E. coli LexA protein. Likewise, other residues present a low degree of substitution that, in addition, corresponds to amino acids of the same family: L4, I15, E30, L47, K53, I56 and I66. This fact suggests that these residues must also be related to structural functions of the LexA helix–turn–helix (HTH) complex rather than to the specific recognition of the DNA-binding sequence. It has been suggested that, in E. coli, the third α-helix of the LexA HTH complex plays the leading role in specific DNA recognition (Knegtel et al., 1995). However, other residues in the remaining α-helices or between them must also play a significant part in specific DNA recognition, since a F. succinogenes LexA protein derivative in which the sequence of the third α-helix has been replaced through directed mutagenesis with that of E. coli LexA cannot bind the E. coli-like CTGTN8ACAG motif (data not shown).
Furthermore, we have also demonstrated that a functional Alphaproteobacteria LexA-binding sequence may be easily generated from the cyanobacterial one through a single insertion event while, in turn, the cyanobacterial LexA box derives directly from the Gram-positive one (Mazón et al., 2004). The use of DNA recognition motifs in combination with other phylogenetic evidence has been proposed earlier as a measure of divergence to refine phylogenetic analyses and as a milestone to highlight branching points in evolution (Rodionov et al., 2001; Rajewsky et al., 2002; Erill et al., 2003). Therefore, the experimental evidence of relatedness between Alphaproteobacteria and Cyanobacteria LexA boxes takes on a new relevance when combined with the fact that these two groups also cluster together in the phylogenetic tree of LexA proteins (Fig. 4). This close relationship between Alphaproteobacteria and Cyanobacteria is clearly at odds with the traditional positioning of the Alphaproteobacteria class in the bacterial evolutionary tree, as prompted by the RecA protein (Fig. 9; Eisen, 1995) and 16S rRNA and signature protein phylogenies (Woese et al., 1984; Gupta & Griffiths, 2002), since these three phylogenetic techniques place the Alphaproteobacteria very close to the Betaproteobacteria and far removed from either Cyanobacteria or Gram-positive bacteria. The best explanation for this divergence from conventional phylogenetic data is to suppose that, after branching from other Proteobacteria classes, Alphaproteobacteria lost their vertically transmitted lexA gene, but later incorporated a novel lexA copy through LGT from either a cyanobacterium or a bacterial species closely related to this phylum. This LGT event, however, must have occurred very early in the evolutionary history of the Alphaproteobacteria, since the same protein is present in all Alphaproteobacteria that have not suffered major reductions in chromosome size (as has occurred in, for example, Rickettsia), and GC content and codon usage of the extant lexA genes are in perfect agreement with the mean values for each of the Alphaproteobacteria hosting them. In this context, it should be stressed that the loss of the lexA gene does not seem to be a very unusual event in bacterial evolution, as it has already been described in several genera (including Aquifex, Borrelia, Campylobacter, Chlamydia, Helicobacter, Mycoplasma and Rickettsia). Up to now, a common characteristic of those bacteria for which the lack of a lexA gene had been described was that they had undergone a major reduction in chromosome size, suggesting that massive genome reduction was a convergent evolutionary cause for the loss of the lexA gene. However, given that the Alphaproteobacteria species analysed here do not show significant reductions in genetic material, our data concerning their LexA protein break with this traditional assumption and hint at the possible existence of losses and lateral acquisitions of the lexA gene among bacteria.
Although further work is still necessary to elucidate whether similar LGT events have taken place in other bacterial phyla, the evidence reported here of lateral transfer of the lexA gene sheds new light on the evolutionary history of complex regulatory networks like the LexA-governed SOS response and validates the previously reported use of regulatory motifs, in combination with phylogenetic and protein signature studies, as reliable indicators of phylogenetic history.
 |
ACKNOWLEDGEMENTS
|
---|
This work was funded by Grants BMC2001-2065 from the Ministerio de Ciencia y Tecnología (MCyT) de España and 2001SGR-206 from the Departament d'Universitats, Recerca i Societat de la Informació (DURSI) de la Generalitat de Catalunya, and by the Consejo Superior de Investigaciones Científicas (CSIC). We are deeply indebted to Dr Roger Woodgate for his generous gifts of E. coli and B. subtilis LexA proteins. We wish to acknowledge Joan Ruiz for his excellent technical assistance and collaboration. The free access to the F. succinogenes preliminary sequence data at The Institute for Genomic Research (TIGR) is also acknowledged. Partial sequencing of F. succinogenes was accomplished with support from the US Department of Agriculture.
 |
REFERENCES
|
---|
Bradford, M. M. (1976). A rapid and sensitive method for the quantification of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72, 248–254.
Burckhardt, S. E., Woodgate, R., Scheuermann, H. R. & Echols, H. (1988). UmuD mutagenesis protein of Escherichia coli: overproduction, purification and cleavage by RecA. Proc Natl Acad Sci U S A 85, 1811–1815.
Campoy, S., Fontes, M., Padmanabhan, S., Cortes, P., Llagostera, M. & Barbe, J. (2003). LexA-independent DNA damage-mediated induction of gene expression in Myxococcus xanthus. Mol Microbiol 49, 769–781.
Combet, C., Blanchet, C., Geourjon, C. & Deléage, G. (2000). NPS@: network protein sequence analysis. Trends Biochem Sci 25, 147–150.
Courcelle, J., Khodursky, A., Peter, B., Brown, P. O. & Hanawalt, P. C. (2001). Comparative gene expression profiles following UV exposure in wild-type and SOS-deficient Escherichia coli. Genetics 158, 41–64.
Davis, E. O., Dullaghan, E. M. & Rand, L. (2002). Definition of the mycobacterial SOS box and use to identify LexA-regulated genes in Mycobacterium tuberculosis. J Bacteriol 184, 3287–3295.
Eisen, J. A. (1995). The RecA protein as a model molecule for molecular systematic studies of bacteria: comparison of trees of RecAs and 16S rRNAs from the same species. J Mol Evol 41, 1105–1123.
Erill, I., Escribano, M., Campoy, S. & Barbé, J. (2003). In silico analysis reveals substantial variability in the gene contents of the Gamma Proteobacteria LexA-regulon. Bioinformatics 19, 2225–2236.
Felsenstein, J. (1989). PHYLIP: phylogeny inference package (version 3.2). Cladistics 5, 164–166.
Fernández de Henestrosa, A. R., Rivera, E., Tapias, A. & Barbé, J. (1998). Identification of the Rhodobacter sphaeroides SOS box. Mol Microbiol 28, 991–1003.
Fernández de Henestrosa, A. R., Ogi, T., Aoyagi, S., Chafin, D., Hayes, J. J., Ohmori, H. & Woodgate, R. (2000). Identification of additional genes belonging to the LexA regulon in Escherichia coli. Mol Microbiol 35, 1560–1572.
Fernández de Henestrosa, A. R., Cuñé, J., Erill, I., Magnuson, J. K. & Barbé, J. (2002). A green nonsulfur bacterium, Dehalococcoides ethenogenes, with the LexA binding sequence found in gram-positive organisms. J Bacteriol 184, 6073–6080.
Fogh, R. H., Ottleben, G., Rüterjans, H., Schnarr, M., Boelens, R. & Kaptein, R. (1994). Solution structure of the LexA repressor DNA binding domain determined by 1H NMR spectroscopy. EMBO J 13, 3936–3944.
Forano, E., Broussolle, V., Gaudet, G. & Bryant, J. A. (1994). Molecular cloning, expression and characterization of a new endoglucanase gene from Fibrobacter succinogenes S85. Current Microbiol 28, 7–14.
Gaudet, G., Forano, E., Dauphin, G. & Delort, A. M. (1992). Futile cycling of glycogen in Fibrobacter succinogenes as shown by in situ 1H-NMR and 13C-NMR investigation. Eur J Biochem 207, 155–162.
Griffiths, E. & Gupta, R. S. (2001). The use of signature sequences in different proteins to determine the relative branching order of bacterial divisions: evidence that Fibrobacter diverged at a similar time to Chlamydia and the Cytophaga-Flavobacterium-Bacteroides division. Microbiology 147, 2611–2622.
Gupta, R. S. & Griffiths, E. (2002). Critical issues in bacterial phylogeny. Theor Popul Biol 61, 423–434.
Higgins, D., Thompson, J., Gibson, T., Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
Knegtel, R. M. A., Fogh, R. H., Ottleben, G., Rüterjans, H., Dumoulin, P., Schnarr, M., Boelens, R. & Kaptein, R. (1995). A model for the LexA repressor DNA complex. Proteins 21, 226–236.
Laemmli, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680–685.
Little, J. W. (1984). Autodigestion of lexA and phage lambda repressors. Proc Natl Acad Sci U S A 81, 1375–1379.
Little, J. W. (1991). Mechanism of specific LexA cleavage: autodigestion and the role of RecA coprotease. Biochimie 73, 411–422.
Ludwig, W. & Schleifer, K. H. (1999). Phylogeny of Bacteria beyond the 16S rRNA standard. ASM News 65, 752–757.
Luo, Y., Pfuetzner, R. A., Mosimann, S., Paetzel, M., Frey, E. A., Cherney, M., Kim, B., Little, J. W. & Strynadka, C. J. (2001). Crystal structure of LexA: a conformational switch for regulation of self-cleavage. Cell 106, 585–594.
Maidak, B. L., Cole, J. R., Parker, C. T., Jr & 11 other authors (1999). A new version of the RDP (Ribosomal Database Project). Nucleic Acids Res 27, 171–173.
Mazón, G., Lucena, J. M., Campoy, S., Fernández de Henestrosa, A. R., Candau, P. & Barbé, J. (2004). LexA-binding sequences in Gram-positive and cyanobacteria are closely related. Mol Genet Genomics 271, 40–49.
Nohmi, T., Battista, J. R., Dodson, L. A. & Walker, G. C. (1988). RecA-mediated cleavage activates UmuD for mutagenesis: mechanistic relationship between transcriptional derepression and posttranslational activation. Proc Natl Acad Sci U S A 85, 1816–1820.
Norioka, N., Hsu, M. Y., Inouye, S. & Inouye, M. (1995). Two recA genes in Myxococcus xanthus. J Bacteriol 177, 4179–4182.
Page, R. D. M. (1996). TreeView: an application to display phylogenetic trees on personal computers. CABIOS 12, 357–358.
Patzer, S. I. & Hantke, K. (2001). Dual repression by Fe2+-Fur and Mn2+-MntR of the mntH gene, encoding an NRAMP-like Mn2+ transporter in Escherichia coli. J Bacteriol 183, 4806–4813.
Rajewsky, N., Socci, N., Zapotocky, M. & Siggia, E. D. (2002). The evolution of DNA regulatory regions for proteo-gamma bacteria by interspecies comparisons. Genome Res 12, 298–308.
Rodionov, D. A., Mironov, A. A. & Gelfand, M. S. (2001). Transcriptional regulation of pentose utilisation systems in the Bacillus/Clostridium group of bacteria. FEMS Microbiol Lett 205, 305–314.
Sambrook, J., Fritsch, E. F. & Maniatis, T. (1992). Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.
Sanger, F., Nicklen, S. & Coulson, S. (1977). DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A 74, 5463–5467.
Sassanfar, M. & Roberts, J. W. (1990). Nature of SOS-inducing signal in Escherichia coli. The involvement of DNA replication. J Mol Biol 212, 79–96.
Tapias, A. & Barbé, J. (1999). Regulation of divergent transcription from the uvrA-ssb promoters in Sinorhizobium meliloti. Mol Gen Genet 262, 121–130.
Tapias, A., Fernández, S., Alonso, J. C. & Barbé, J. (2002). Rhodobacter sphaeroides LexA has dual activity: optimising and repressing recA gene transcription. Nucleic Acids Res 30, 1539–1546.
Walker, G. C. (1984). Mutagenesis and inducible responses to deoxyribonucleic acid damage in Escherichia coli. Microbiol Rev 48, 60–93.
Winterling, K. W., Chafin, D., Hayes, J. J., Sun, J., Levine, A. S., Yasbin, R. E. & Woodgate, R. (1998). The Bacillus subtilis DinR binding site: redefinition of the consensus sequence. J Bacteriol 180, 2201–2211.
Woese, C. R., Stackebrandt, E., Weisburg, W. C. & 8 other authors (1984). The phylogeny of purple bacteria: the alpha subdivision. Syst Appl Microbiol 5, 315–326.
Received 10 May 2004;
revised 14 July 2004;
accepted 15 July 2004.
Copyright © 2004 Society for General Microbiology.