A family of putative MSCRAMMs from Enterococcus faecalis

Jouko Sillanpää1, Yi Xu1, Sreedhar R. Nallapareddy2, Barbara E. Murray2 and Magnus Höök1

1 Texas A&M University System Health Science Center, Institute of Biosciences and Technology, Center for Extracellular Matrix Biology, Houston, TX 77030, USA
2 University of Texas Medical School, Division of Infectious Diseases, Department of Internal Medicine, and Department of Microbiology and Molecular Genetics, Center for the Study of Emerging and Re-emerging Pathogens (CERP), Houston, TX 77030, USA

Correspondence
Magnus Höök
mhook{at}ibt.tamushsc.edu


   ABSTRACT
The recently published Enterococcus faecalis genome [Paulsen, I. T., Banerjei, L., Myers, G. S. & 29 other authors (2003). Science 299, 2071–2074] was examined and 41 putative cell-wall-anchored proteins were identified. Seventeen of these proteins are predicted to contain tandemly repeated immunoglobulin-like folds characteristic of the structural organization of staphylococcal adhesins of the MSCRAMM (microbial surface component recognizing adhesive matrix molecules) type. Two of the nine proteins selected for further study appear to represent cell-wall-anchored enzymes. It is proposed that the remaining seven proteins constitute a family of structurally related proteins potentially interacting with proteins of the host. This family includes the previously identified collagen/laminin-binding MSCRAMM ACE [Rich, R. L., Kreikemeyer, B., Owens, R. T., LaBrenz, S., Narayana, S. V., Weinstock, G. M., Murray, B. E. & Hook, M. (1999). J Biol Chem 274, 26939–26945]. It is further demonstrated that genes encoding the seven putative MSCRAMMs are present in all E. faecalis strains tested and that these proteins appear to be expressed during infection in humans, since sera from infected individuals contain antibodies reacting with recombinant versions of the enterococcal proteins.


Abbreviations: CD, circular dichroism; CWA, cell-wall anchor; ECM, extracellular matrix; Ig, immunoglobulin; MSCRAMM, microbial surface component recognizing adhesive matrix molecule


   INTRODUCTION
Enterococcus faecalis, which forms part of the commensal intestinal flora of mammals, has emerged as a major opportunistic nosocomial pathogen during the last two decades, often causing infections in hospitalized patients receiving antibiotic therapy. Disease-associated strains of this species frequently harbour a multitude of acquired and intrinsically evolved resistance mechanisms targeting the most commonly used antibiotics, which has complicated the treatment of enterococcal infections (Huycke et al., 1998; Murray, 1990; Murray & Weinstock, 1999; Tailor et al., 1993). Many of the antibiotic resistance genes are located on mobile genetic elements, e.g. small plasmids and transposons (Paulsen et al., 2003), which allows for rapid dissemination of antibiotic resistance among enterococcal strains. We are now approaching a stage where enterococci potentially could acquire resistance to all available clinically active antibiotics. This situation underlines the need for a better understanding of the molecular mechanisms involved in the pathogenesis of enterococcal infections as a foundation for the development of new drugs and treatment strategies.

The ability to adhere to host tissues is a critical step in the onset of most microbial infections. E. faecalis, along with Staphylococcus aureus, another major nosocomial pathogen, is primarily an extracellular pathogen and presumably adheres to components of the extracellular matrix (ECM). In fact, earlier studies have shown that enterococci can attach to substrates composed of collagens, laminin, vitronectin, fibronectin, fibrinogen, lactoferrin and thrombospondin (Rozdzinski et al., 2001; Styriak et al., 2002, 1999; Xiao et al., 1998; Zareba et al., 1997). In general, the responsible enterococcal adhesins have not been identified or characterized. ACE, a collagen-binding adhesin of the MSCRAMM (microbial surface component recognizing adhesive matrix molecules) family, is an exception. This protein was previously identified in our laboratories based on its sequence similarity to the staphylococcal MSCRAMM CNA (Nallapareddy et al., 2000b; Rich et al., 1999). Both proteins bind to multiple sites in collagen but with different kinetics. Furthermore, ACE but not CNA appears to bind to laminin. The identification of other enterococcal MSCRAMMs has been complicated by an apparently stringent regulation of the expression of the adhesins in E. faecalis, so that bacteria grown under conventional in vitro conditions usually do not adhere efficiently to the different ECM components (Nallapareddy et al., 2000a; Xiao et al., 1998).

In contrast, S. aureus adhesins can readily be isolated from bacteria grown in broth under conventional conditions, and these proteins have been characterized in more detail. The S. aureus adhesins identified so far are mostly cell-wall-anchored proteins that belong to the MSCRAMM family. These proteins have a similar structural organization. A signal peptide is followed by a so-called A-domain. The A-domain appears to contain the primary ligand binding site of most of the staphylococcal MSCRAMMs. X-ray crystal structure information as well as structural modelling analysis suggest that the A-domains contain two or three subdomains that each adopt an immunoglobulin-like (Ig-like) fold (Deivanayagam et al., 2002; Perkins et al., 2001; Ponnuraj et al., 2003). A recent structural analysis of the Staphylococcus epidermidis fibrinogen-binding MSCRAMM SdrG in complex with a peptide ligand allowed us to propose a sophisticated ligand-binding mechanism that we have called ‘dock, lock and latch’ (Ponnuraj et al., 2003). In this model, the ligand peptide docks in a pocket formed in the interface between two Ig-folded subdomains in a process that redirects the C-terminal extension of the MSCRAMM to lock the ligand peptide in place. The MSCRAMM extension then acts as a latch, inserts into a cleft and complements a β-sheet of a neighbouring Ig-folded subdomain at the N terminus. The latching cleft contains a sequence motif TYTFTDYVD that is found in a similar position in all S. aureus MSCRAMMs identified so far (Josefsson et al., 1998; Ponnuraj et al., 2003). It is plausible that this binding mechanism is also used by other MSCRAMMs. The A-domains are followed by segments composed of repeated domains. At least some repeated units can also adopt an Ig-like fold (Deivanayagam et al., 2000), resulting in some MSCRAMMs (e.g. CNA) containing almost exclusively repeated Ig-like folds. At the C-terminal end of the S. aureus MSCRAMMs are features required for cell-wall anchoring, including an LPXTG (X being any amino acid residue) motif preceded by a cell-wall-spanning domain, followed by a hydrophobic transmembrane region and, finally, a cytosolic tail composed of a short sequence rich in positively charged amino acid residues (Patti & Hook, 1994). The LPXTG motif is recognized by a transpeptidase called sortase, which is responsible for catalysing the covalent linking of these proteins to the staphylococcal cell wall (Mazmanian et al., 2001; Schneewind et al., 1992). This cell-wall anchoring mechanism seems to be widely utilized among Gram-positive bacteria, since sortase homologues have been identified in almost all Gram-positive species examined (Pallen et al., 2001).

In the current study, we used a bioinformatic approach to discover novel MSCRAMM-like proteins in the pre-published genome of E. faecalis strain V583 (Paulsen et al., 2003). We expressed the A-domains of the identified proteins and determined their secondary structure composition by circular dichroism (CD) spectroscopy. We also determined the presence of the corresponding genes among different E. faecalis isolates and the expression of the proteins during enterococcal infection in humans.


   METHODS
Bacterial strains and culture conditions.
Thirty E. faecalis clinical isolates (from endocarditis, blood, urine and wounds) representing broad genomic diversity, as assessed by pulsed-field gel electrophoresis, were selected arbitrarily from B. E. Murray's collection. These were used in hybridization experiments to examine the distribution of putative MSCRAMM genes. E. faecalis strains were routinely grown in brain heart infusion medium (Difco) overnight at 37 °C. E. faecalis V583, the sequenced strain (Sahm et al., 1989), was used as a source for cloned constructs unless otherwise stated. Escherichia coli strains JM109 (Stratagene), XL1Blue (Stratagene) and M15(pREP4) (Qiagen) were grown in Luria–Bertani (Sigma) medium with appropriate antibiotics at 37 °C.

Genome search for cell-wall anchor (CWA) proteins.
Initially, the genome sequence of E. faecalis strain V583, including its plasmids, was downloaded from the TIGR website (www.tigr.org) in four contigs to a local Silicon Graphics machine. ORFs were predicted using Glimmer 2 (obtained from TIGR and installed locally) with a minimum ORF length of 500 nt. The ORFs were then translated into amino acid sequences using a custom-designed translation program capable of translating sequences in batches. The process was automated using Unix C shell scripts. The peptide sequences were formatted into a BLAST-searchable database using Formatdb obtained from NCBI and installed locally.
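The batch translation step can be illustrated with a minimal sketch. This is not the custom program used in the study; it assumes that ORFs are supplied as in-frame nucleotide strings, that the standard genetic code applies, and that the 500 nt minimum is re-applied at this stage. The function names translate and translate_batch are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1 layout).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate an in-frame ORF up to the first stop codon.

    Unknown codons (e.g. containing N) are rendered as 'X'.
    """
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE.get(orf[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_batch(orfs, min_nt=500):
    """Translate every ORF of at least min_nt nucleotides (assumed cut-off)."""
    return {name: translate(seq) for name, seq in orfs.items() if len(seq) >= min_nt}
```

Note that the sketch does not convert alternative start codons to methionine, which a production tool for bacterial genomes would normally handle.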

Amino acid sequences in the Gram-positive CWA family (Gram-pos_anchor) were obtained from the protein family database of alignments and HMMs (www.sanger.ac.uk/cgi-bin/Pfam/). These sequences were analysed for the occurrence of amino acid residues at the LPXTG location. The pattern LPX[TSA][GANS] covers 95 % of all the sequences in the family and was used as the search pattern in PHI-BLAST using locally installed stand-alone BLAST obtained from NCBI. The amino acid sequences of several proteins were used as templates in PHI-BLAST, including ACE, a known cell-wall-anchored protein of E. faecalis, as well as CNA and protein A, two known cell-wall-anchored proteins of S. aureus with different sequences and structures. The output from the PHI-BLAST searches was combined and analysed to select for proteins that contain typical features of cell-wall-anchored proteins: a signal peptide at the N terminus (predicted using the SignalP server at www.cbs.dtu.dk/services/SignalP/), the LPXTG-motif close to the C terminus (visual examination) followed by a hydrophobic transmembrane segment (predicted using the TMHMM server at www.cbs.dtu.dk/services/TMHMM-2.0/) and several positively charged residues at the C terminus (visual examination). The above procedure was repeated as the genome was updated. In all, 29 putative cell-wall-anchored proteins were found, 11 of which encode sex-pheromone-related functions. After the genome was annotated and published (Paulsen et al., 2003), the TIGR database was searched for proteins annotated as LPXTG-motif CWA domain proteins. The results of the different searches were compared.
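The anchor-motif screen can be sketched as a regular-expression search. The sketch below is illustrative only: the study used PHI-BLAST followed by visual examination, whereas here the pattern LPX[TSA][GANS] is applied directly to each translated ORF. The 60-residue window defining 'close to the C terminus' and the 10-residue tail used to count positively charged residues are assumptions, and the function names are hypothetical.

```python
import re

# Broadened cell-wall anchor pattern quoted in the text: LPX[TSA][GANS]
ANCHOR = re.compile(r"LP.[TSA][GANS]")


def find_anchor(seq, max_from_c_term=60):
    """Return (start, motif) of the last anchor motif that begins within
    max_from_c_term residues of the C terminus, or None if there is none.

    The window size is an assumed stand-in for visual examination.
    """
    hits = [m for m in ANCHOR.finditer(seq)
            if len(seq) - m.start() <= max_from_c_term]
    if not hits:
        return None
    last = hits[-1]
    return last.start(), last.group()


def positive_tail_count(seq, tail_len=10):
    """Count Lys/Arg residues in the last tail_len residues (assumed length)."""
    return sum(seq[-tail_len:].count(aa) for aa in "KR")
```

A candidate passing both checks would still need a predicted signal peptide and transmembrane segment, which the study obtained from the SignalP and TMHMM servers.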

Construction of expression plasmids.
Genomic DNA from E. faecalis V583 was isolated from cells grown overnight with the G NOME genomic DNA isolation kit, according to the manufacturer's instructions (BIO 101), dissolved in TE and stored at 4 °C. Primers for cloning PCR-generated DNA fragments from the V583 genome into expression vector pQE-30 (Qiagen) are listed in Table 1. Standard cloning procedures were used. Ligation mixtures were initially transformed into Escherichia coli JM109 or XL1Blue or directly into the expression host M15(pREP4). Cloned sequences were confirmed by automated DNA sequencing (ABI Prism DNA sequencer; Applied Biosystems) at the University of Texas Medical School DNA sequencing facility, using pQE-30 sequencing primers as well as internal primers. In the case of EF1093, the pQE-30-based clone contained a stretch of six extra nucleotides compared with the sequence deposited in the genome database, resulting in the addition of two amino acids, histidine (196) and valine (197), to the sequence. To exclude possible PCR and cloning errors, the presence of the HV insertion in the V583 genome was confirmed by sequencing over this region directly from genomic DNA.


Table 1. Synthetic oligonucleotides used in this study

 
Expression and purification of His-tag proteins.
One-litre Escherichia coli M15(pREP4) cultures harbouring appropriate pQE-30-based constructs were grown to OD600=0·6 following an initial 2 % inoculation from overnight cultures. After 2–3 h induction with 0·4 mM IPTG, cells were collected with centrifugation, resuspended in 10 mM Tris/HCl, 100 mM NaCl, pH 7·9, and stored at –80 °C. To lyse the cells and release the expressed protein, cells were passed twice through a French press with a gauge pressure setting at 1100 p.s.i. to give an estimated internal cell pressure of 18 000 p.s.i. The lysate was centrifuged at an RCFmax (maximum relative centrifugal force) of 165 000 g and the supernatant was filtered through a 0·45 µm filter. The volume was adjusted to 15 ml with 10 mM Tris/HCl, 100 mM NaCl, pH 7·9. Imidazole was then added to the sample to a concentration of 6·5 mM to minimize non-specific binding. The sample was loaded onto a nickel-charged iminodiacetic acid-Sepharose chromatography column (HiTrap chelating HP; Amersham Biosciences) that had previously been equilibrated with 10 mM Tris/HCl, 100 mM NaCl, pH 7·9, and connected to an FPLC system (Pharmacia). Bound protein was eluted with a linear gradient of 0–100 mM imidazole in 10 mM Tris/HCl, 100 mM NaCl, pH 7·9, over 100–200 ml. Protein-containing fractions were analysed by SDS-PAGE, appropriately pooled and dialysed against 25 mM Tris/HCl, 1 mM EDTA, pH 6·5–9·0 (depending on the calculated pI of the protein purified), before applying the samples to an anion-exchange Sepharose column (HiTrap Q HP; Amersham Biosciences) for further purification. Bound protein was eluted with a linear gradient of 0–0·5 M NaCl in 25 mM Tris/HCl, 1 mM EDTA, pH 6·5–9 over 100 ml. Finally, protein samples were dialysed extensively against PBS, pH 7·4, and stored at +4 °C. Protein concentrations were determined by absorption spectroscopy at 280 nm using calculated molar absorption coefficient values (Pace et al., 1995). 
Molecular masses of the expressed proteins were confirmed with MALDI-TOF MS (Tufts University MS facility) from protein samples in H2O.
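The concentration calculation from absorbance at 280 nm can be sketched as follows. The sketch assumes the commonly quoted coefficients from Pace et al. (1995), namely 5500 per tryptophan, 1490 per tyrosine and 125 per cystine (in M⁻¹ cm⁻¹), together with the Beer–Lambert law. The default of zero disulfide bonds and the 1 cm path length are assumptions, and the function names are hypothetical.

```python
def molar_absorption_280(seq, n_cystine=0):
    """Calculated molar absorption coefficient at 280 nm (M^-1 cm^-1).

    Uses 5500 per Trp, 1490 per Tyr and 125 per cystine (disulfide bond).
    n_cystine defaults to 0, i.e. no disulfides assumed.
    """
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_cystine


def concentration_molar(a280, seq, path_cm=1.0, n_cystine=0):
    """Protein concentration (mol/l) from A280 via the Beer-Lambert law."""
    eps = molar_absorption_280(seq, n_cystine)
    if eps == 0:
        raise ValueError("sequence has no Trp, Tyr or cystine; A280 is uninformative")
    return a280 / (eps * path_cm)
```

For a hypothetical sequence with one tryptophan and two tyrosines, the coefficient is 8480 M⁻¹ cm⁻¹, so an A280 of 0.848 in a 1 cm cell corresponds to 100 µM.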

CD spectra.
Far-UV CD data were collected with a Jasco J720 spectropolarimeter calibrated with (+)-10-camphorsulfonic acid, employing a bandwidth of 1 nm and integrated for 1 s at 0·2 nm intervals. Spectra were recorded at ambient temperature in cylindrical 0·2 mm path-length cuvettes. Sample concentrations ranged between 3 and 20 µM in 10 mM KPO4 pH 7·4. Twenty scans were averaged for each spectrum and the contribution of buffer was subtracted. Quantification of secondary structure components was performed by analysing the spectra with a combination of five deconvolution programs: Contin, Neural Network, VARSELEC, Selcon and CD Estima as described by Rich et al. (1999).

Determination of antibody titres of sera from E. faecalis infections against recombinant E. faecalis proteins.
Nine sera from patients with E. faecalis infections (seven sera from patients with E. faecalis endocarditis, one from a patient with bacteraemia and cholangitis, and another from a patient with urosepsis) that were known to have antibodies against enterococcal total proteins were included. A second set, consisting of nine sera obtained from hospitalized patients with no known enterococcal infection, was included as a non-healthy control group. The presence of antibodies against enterococcal proteins in these sera was tested in an ELISA as described by Arduino et al. (1994) and Nallapareddy et al. (2000a) with some modifications. Briefly, 20 ng of each purified enterococcal protein in 100 µl PBS was coated onto microplates (96-well, Immulon 4HBX; Thermo Labsystems) overnight at 4 °C. The plates were blocked with 1 % (w/v) BSA, 0·01 % (v/v) Tween20 in PBS at ambient temperature for 1 h and 100 µl of the sera in blocking buffer was added. Each serum was tested in triplicate with serial dilutions from 1 : 100 to 1 : 6400. Plates were incubated for 2 h at ambient temperature and washed three times with 0·01 % Tween20 in PBS. One hundred microlitres of a 1 : 10 000 dilution of horseradish peroxidase-conjugated anti-human IgG (Sigma) was added and incubated for 2 h. After three washes, antibody binding was detected with 3,3',5,5'-tetramethylbenzidine in the presence of H2O2 in 0·1 M citrate/acetate buffer, pH 6·0, at ambient temperature for 15 min. The reaction was stopped with 2 M H2SO4 and A450 was recorded. Titres were determined after subtracting A450 values from appropriate controls. To determine a cut-off level for serum titres, four additional control sera from healthy volunteers without any known prior E. faecalis infection were assayed. The sum of mean A450 values and two times the standard deviations for each dilution of the control sera were set as cut-off levels for positive titres.
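The cut-off rule described above (mean plus two standard deviations of the control sera at each dilution) can be sketched as follows. The text does not state whether the sample or the population standard deviation was used, nor exactly how a titre was read from the dilution series; the sketch assumes the sample standard deviation and defines the titre as the highest reciprocal dilution whose A450 exceeds the cut-off. All names and example values are hypothetical.

```python
from statistics import mean, stdev


def cutoffs(control_a450):
    """Per-dilution cut-off: mean + 2 SD of the control sera.

    control_a450 maps reciprocal dilution -> list of control A450 values.
    Sample standard deviation is an assumption.
    """
    return {d: mean(v) + 2 * stdev(v) for d, v in control_a450.items()}


def titre(serum_a450, cutoff):
    """Highest reciprocal dilution at which the serum A450 exceeds the
    cut-off, or None if no dilution is positive (assumed definition)."""
    positive = [d for d, a in serum_a450.items() if a > cutoff[d]]
    return max(positive) if positive else None
```

With four control readings of 0.10, 0.12, 0.08 and 0.10 at a 1 : 100 dilution, the cut-off is about 0.133, so a test serum reading 0.5 at that dilution would be scored positive.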

Colony hybridization.
E. faecalis strains were inoculated on sterile nylon membranes placed on BHI agar plates (colony side up) and grown overnight at 37 °C. The resulting colonies were lysed on the membrane and the filters were hybridized with intragenic DNA probes obtained by PCR amplification of E. faecalis V583 genomic DNA using respective primers listed in Table 1. Radiolabelled DNA probes were prepared by random primed labelling according to the protocol supplied (RadPrime DNA labelling System; Invitrogen). Colony hybridization was carried out under high stringency conditions using previously described methods (Coque et al., 1995).


   RESULTS
Identifying putative cell-wall-anchored proteins with tandemly repeated Ig-folds
Initially, we searched the whole genome sequence of E. faecalis strain V583 for ORFs that contain a putative C-terminal CWA domain for covalent attachment to the bacterial cell wall. The ligand-binding activity of most MSCRAMMs has been localized to the N-terminal A-regions that are ~500 aa long. Therefore, we limited our search to ORFs with a minimum length of 500 nt. Twenty-nine predicted proteins that fulfil the search criteria were identified. These ORFs contained an N-terminal signal peptide sequence as well as the consensus motif LPX[TSA][GANS] near the C terminus followed by a hydrophobic region. The data published (Paulsen et al., 2003) or presented on the E. faecalis V583 genome sequence database (www.tigr.org) identified 21 putative CWA proteins. Of these ORFs, nine were also found in our search; 20 ORFs were identified by us as putative CWA proteins, but were not identified as putative CWA proteins in the database. The 12 additional putative CWA proteins identified in the TIGR database, but not in our search, were short proteins (ORF <400 nt) or contained a less-stringent LPXTG motif. By combining the results of the two searches, we identified 41 putative CWA proteins in the E. faecalis V583 genome.
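The arithmetic behind the combined total follows from inclusion-exclusion over the two candidate lists, and can be checked with a short sketch. The function name and dictionary keys are hypothetical; the counts are those given in the text.

```python
def combine_searches(n_local, n_database, n_shared):
    """Combine two overlapping candidate lists by inclusion-exclusion."""
    local_only = n_local - n_shared        # found only in the local search
    database_only = n_database - n_shared  # annotated only in the database
    return {
        "local_only": local_only,
        "database_only": database_only,
        "shared": n_shared,
        "total": local_only + database_only + n_shared,
    }


# 29 proteins from the local search, 21 from the database, 9 in common
summary = combine_searches(29, 21, 9)
```

Here summary contains 20 proteins unique to the local search, 12 unique to the database and 41 in total, matching the figures reported above.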

The three-dimensional structures solved for the ligand-binding segments of the A-domains of staphylococcal MSCRAMMs are composed of two subdomains that both represent a variant (DEv) of the Ig-fold (Deivanayagam et al., 2002, 2000; Ponnuraj et al., 2003; Symersky et al., 1997). We therefore decided to focus on enterococcal CWA proteins containing repeated Ig-like folds. By using the 3D-PSSM fold recognition server (Kelley et al., 2000), we initially found 14 enterococcal CWA proteins with predicted Ig-like folds among the panel of 29 putative CWA proteins identified by us. Four of these are highly conserved aggregation substance (Galli et al., 1990, 1992) homologues (77 % overall identity in multiple alignment) and one shows significant similarity (31 % identity over 1457 aa) to the biofilm-associated protein (Bap) of S. aureus (Cucarella et al., 2001). As we were focusing on novel and uncharacterized CWA proteins with MSCRAMM-like structural features, we chose the remaining nine proteins (EF0089, EF1091, EF1092, EF1093, EF1099, EF1269, EF1824, EF2224 and EF3023; TIGR annotation locus names have been used throughout this study) for further investigation (Fig. 1). All of these proteins were identified as putative CWA proteins in our search and contain the anchor motif LPX[TSA][GANS]; EF1092, EF1093, EF1269 and EF2224 were also annotated as ‘cell wall surface anchor family proteins’ in the TIGR genome database. In seven of the nine proteins (EF0089, EF1091, EF1092, EF1093, EF1099, EF1269 and EF2224), most of the primary sequence consists of multiple matches (>90 % e value certainty) to segments of about 150–500 aa from proteins of the Ig superfamily (Fig. 1). These segments consist of tandemly repeated domain modules with Ig-like folds. In contrast, the predicted Ig-folded regions in proteins EF1824 and EF3023 are relatively short and, for EF3023, exhibited a lower potential for an Ig-folded structure (<50 %).



Fig. 1. LPXTG motif CWA proteins with Ig-like folds from E. faecalis. S, signal peptide; W, cell-wall-spanning region; M, membrane anchor; C, cytoplasmic tail with charged residues. The triangle above the CWA domain depicts the LPXTG motif. White areas, non-repeated sequence; hatched areas, repeat regions. Regions containing Ig-like folds with over 90 % e value certainty, 50 % in the case of EF3023, are marked with black arrows.

 
Analysis of structural organization of MSCRAMM-like proteins from E. faecalis
A putative N-terminal signal peptide sequence, ranging from 25 to 42 aa, was found in all nine proteins (Fig. 1). In the case of EF1091, EF0089 and EF1099, codons located 40 (ATT), 48 (ATG) and 53 (ATG) aa, respectively, upstream of the start annotated in the current genome database were considered the most likely translation initiation sites, as they are preceded by properly distanced Shine–Dalgarno sequences and result in a characteristic N-terminal signal peptide. Hence, these amino acids have been set as +1. Analyses of the primary structures of the nine enterococcal proteins revealed additional features characteristic of staphylococcal MSCRAMMs. A non-repetitive N-terminal region resembling the A-domains of MSCRAMMs is followed by sequence repeats in proteins EF0089, EF1099, EF1824, EF2224 and EF3023. At the C terminus, the nine proteins have an anchor domain consisting of a cell-wall-spanning region with the LPXTG motif and a hydrophobic transmembrane sequence, followed by a short stretch of amino acids with a high proportion of positively charged residues, presumably representing a cytoplasmic domain.

The A-regions of the previously characterized S. aureus MSCRAMMs are typically ~500 aa long (445–625 aa) and appear to consist of three subdomains with Ig-like folds: N1, N2 and N3 (Deivanayagam et al., 2002; Perkins et al., 2001; Ponnuraj et al., 2003). The N-terminal regions in proteins EF1092, EF1093, EF1099, EF1269 and EF2224 are between 336 and 741 aa in length and could contain 2–5 Ig-like motifs. The four remaining proteins, EF0089, EF1091, EF1824 and EF3023, have larger non-repetitive regions, between 999 and 1787 aa, and could fold into several additional subdomains.

The A-regions of EF1099 (=ACE) (Rich et al., 1999) and EF1269 show the highest primary sequence similarity to the A-regions of the staphylococcal MSCRAMMs among the nine proteins. The A-region of ACE consists of a primary sequence (residues 32–367) that has 27 % identity to the collagen-binding A-region subdomains N1 and N2 of CNA (residues 32–358) (Symersky et al., 1997). EF1269 contains a corresponding N-terminal segment (residues 53–377) with highest similarity to the fibrinogen-binding N2 and N3 subdomains of the staphylococcal MSCRAMMs ClfA, ClfB and SdrG (Deivanayagam et al., 2002; Perkins et al., 2001; Ponnuraj et al., 2003) as well as to the putative N2 and N3 subdomains of SdrD and SdrE (McCrea et al., 2000). Pairwise amino acid identities in this region range from 21 to 30 % between any of the four proteins ClfA, ClfB, SdrG and EF1269. A conserved TYTFTDYVD-like motif is found in one of the two Ig-repeats (N1 in CNA, N2 in others) that form the ligand-binding region in staphylococcal MSCRAMMs (Josefsson et al., 1998; Ponnuraj et al., 2003). In the ‘dock, lock and latch’ ligand-binding model, this motif forms a cleft that accommodates the C-terminal latch extension of the neighbouring Ig-folded subdomain (Ponnuraj et al., 2003). A sequence, TLTYTDYVE, 78 % identical to the TYTFTDYVD-like motifs present in staphylococcal MSCRAMMs, is found in the first Ig-fold region of EF1269. The corresponding N1 domain of ACE contains a similar sequence, VATFNEKVE. Both of these sequences aligned perfectly with the TYTFTDYVD-like motifs in multialignments. Secondary structure alignments further support the relatedness: the locations of predicted β-sheet strands in EF1269 aligned with those of the staphylococcal MSCRAMMs. Because of these similarities, it appears likely that the N-terminal segments of EF1269 and ACE adopt the DEv variant of the Ig-fold that was found in the N2N3 domains of ClfA and SdrG (Deivanayagam et al., 2002; Ponnuraj et al., 2003).
However, both EF1269 and ACE resemble CNA in lacking the first Ig-domain found in the other MSCRAMMs. A third potential MSCRAMM N2N3-like domain was revealed by 3D-PSSM fold recognition analysis of EF0089. This region (aa 220–559) presented a structural match to the crystallized ClfA N2N3 domain (e=1·7×10⁻⁸) as significant as those found for SdrG N2N3 (e=1·7×10⁻⁸) and ClfB N2N3 (e=8·6×10⁻¹⁰) when they were submitted as query sequences. Like the N2 and N3 subdomains of known MSCRAMMs, the potential EF0089 N2N3 domain is located near the N terminus. Furthermore, it contains the sequence RFTFNERIT, which aligns perfectly with the ‘latching cleft’-like sequences of MSCRAMMs.

Protein BLAST searches revealed that EF1091, EF1092, EF1093 and EF2224 have some similarity to streptococcal and staphylococcal MSCRAMMs, although these relationships are more distant than those described above. Proteins EF1824 and EF3023 possibly contain domains with catalytic activities. A search for conserved domains with reverse position specific BLAST revealed a 662 residue domain of α-glucosidases (family 31 of glycosyl hydrolases, COG1501 in the NCBI database of conserved domains) between residues 99 and 760 in the N-terminal region of EF1824 (e=1×10⁻⁶⁹). A 251 residue conserved domain of secreted bacterial lyases (polysaccharide lyase family 8, pfam02278) involved in the breakdown of hyaluronan and chondroitin in the ECM of host tissues was identified between residues 636 and 886 of EF3023 (e=1×10⁻⁴⁰).

Proteins EF0089, EF1099, EF1824, EF3023 and EF2224 contain C-terminal regions with tandemly repeated motifs
The C terminus of EF1824 contains six repeats of 69–73 residues with pairwise identities ranging from 23·6 to 39·7 % (mean 32·7 %) (Fig. 1). The repetitive region of protein EF0089 consists of nine 73–102 aa repeats, which are well conserved in their N-terminal one-third, but show receding homology toward the C-terminal ends. In contrast to the low-homology repeats in EF0089 and EF1824, the repeats in proteins EF1099, EF2224 and EF3023 are highly conserved. The three consecutive 71 aa repeats of EF3023 share 94 % overall identity in multiple alignment followed by a similarly conserved truncated repeat of 52 aa. EF2224 contains five consecutive full-length 131 aa repeats (89 % identity in CLUSTALW multialignment) and ACE (EF1099) has four full repeats of 47 aa with 87 % overall identity preceded by a 19 aa truncated repeat. The repeats in the ace gene are flanked by a recer sequence (Nallapareddy et al., 2000a; Rich et al., 1999), a possible recombination site for module rearrangements, but this sequence was not found in the genetic element encoding repeats of proteins EF0089, EF1824, EF2224 or EF3023.
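Pairwise identity ranges and means of the kind reported for the repeats can be computed from aligned repeat sequences as sketched below. The sketch assumes that the repeats have already been aligned to equal length with '-' as the gap character, and that identity is calculated over all columns in which at least one sequence has a residue; the text does not state which denominator was used. The function names and example sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of identical residues between two aligned sequences.

    Columns that are gaps in both sequences are ignored; a gap opposite
    a residue counts as a mismatch (assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    same = sum(1 for x, y in columns if x == y)
    return 100.0 * same / len(columns)


def identity_summary(aligned_repeats):
    """Return (min, max, mean) pairwise identity over all repeat pairs."""
    values = [percent_identity(a, b) for a, b in combinations(aligned_repeats, 2)]
    return min(values), max(values), sum(values) / len(values)
```

For six repeats, as in EF1824, the summary would be taken over 15 pairwise comparisons.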

Six of the nine proteins have perfectly conserved canonical LPXTG motifs. The remaining three have anchor motifs that match the broader consensus LPX[TSA][GANS]: EF1824 contains LPKAN, and EF0089 as well as EF1092 contain LPKTN.

Expression, purification and characterization of putative A-regions
We cloned the DNA segments encoding the N-terminal regions of the enterococcal proteins that correspond to the A-domains of MSCRAMMs into the expression vector pQE-30 (Qiagen), which allows the production of recombinant fusion proteins with an N-terminal His6 tag (Fig. 1, Table 1). Because of the large size of the putative A-domain of protein EF1824, this domain was expressed as two segments: the N-terminal region (amino acids 43–819), predicted to contain a conserved domain of bacterial glycosyl hydrolases (AI), and the C-terminal region (820–1829) (AII) were cloned separately. The A-region of ACE from strain EF1, which we previously have expressed (Rich et al., 1999), was used in these studies instead of the A-region of protein EF1099 (strain V583) as these two ACE variants share 100 % amino acid identity in this region. The Escherichia coli-expressed fusion proteins were initially purified on a nickel-charged iminodiacetic acid-Sepharose column and further purified by anion-exchange chromatography. This purification protocol is efficient and allowed the isolation of essentially pure proteins. As seen in Fig. 2, the A-regions migrated in SDS-PAGE gels as predicted from the calculated molecular masses, with the exception of the A-region of EF1091, which migrated as a ~160 kDa protein whereas the calculated molecular mass is 113 kDa. Some bands of lower molecular size, possibly representing degradation products, were observed in the preparations of proteins EF0089, EF1091, EF1824 and EF3023, but these recombinant proteins were nevertheless estimated to be >90 % pure. The AI segment of EF1824 (residues 43–819; Fig. 1) was found in the insoluble fraction of the Escherichia coli lysate and this domain was not further examined.



Fig. 2. Coomassie-stained SDS-PAGE of the Escherichia coli-expressed and purified A-domains of E. faecalis LPXTG proteins. Lanes: a, EF1091A; b, EF1824AII; c, EF0089A; d, EF3023A; e, EF1092A; f, EF2224A; g, EF1269A; h, ACEA (EF1099); i, EF1093A.

 
The purified recombinant A-region proteins were further characterized by MALDI-TOF MS. All nine proteins, including EF1091 which showed an aberrant migration in SDS-PAGE, gave peaks that were in good agreement with the molecular masses calculated from amino acid sequences (Table 2) and thus indicated that full-size proteins had been produced with no post-translational processing.


Table 2. Molecular size analysis

 
The secondary structures of the expressed proteins were predicted based on their amino acid sequences. Eight algorithms were initially evaluated for predicting the α-helix and β-sheet content of segments of the A-domains of the staphylococcal MSCRAMMs CNA, ClfA and SdrG for which crystal structure data have been reported (Deivanayagam et al., 1999; Ponnuraj et al., 2003; Symersky et al., 1997). The results showed high variation with the algorithms PHD, DSC and SOPM (Geourjon & Deleage, 1994; King & Sternberg, 1996; Rost & Sander, 1993) being most accurate when aligned with corresponding crystallography data. The mean deviation from solved structures for CNA, SdrG and ClfA was 0·4, 2·2 and 2·8 % for α-helices, and 1·9, 0·6 and 3·3 % for β-sheet, respectively. Since these three algorithms showed the best potential for reliable secondary structure prediction of Ig-folded MSCRAMM proteins, we decided to use them for the analysis of the nine enterococcal proteins of this study. Seven A-domain regions of the nine proteins (EF0089A, EF1091A, EF1092A, EF1093A, EF1099A, EF1269A and EF2224A) (Table 1, Fig. 1) were predicted to be primarily composed of β-sheet (32–47 %) with a smaller percentage of α-helix (7–15 %) (Table 3). The expressed segments of EF1824 (AII) and EF3023 (A), on the other hand, showed relatively high α-helical content (45 and 24 %). These sequence prediction results were compared with deconvoluted far-UV CD spectra. Spectra were deconvoluted using the programs Contin, Neural Network, VARSELEC, Selcon and CD Estima (Rich et al., 1999) and the contents of β-sheet and α-helix scored by the different programs were averaged. The CD spectra indicated a slightly higher amount of β-sheet (36–57 %) and lower content of α-helix (3–16 %) in the expressed A-regions of EF0089, EF1091, EF1092, EF1093, EF1099, EF1269 and EF2224 compared to those predicted from the primary sequence.
The observed high proportion of {beta}-sheet agrees well with known MSCRAMM structures and gives physical support for the predicted Ig-folds in the enterococcal proteins. The recorded spectra also supported the predicted high {alpha}-helix content in the expressed segments of EF1824 (AII) (29 %) and EF3023 (A) (33 %). The secondary structure composition of the ACE A-region determined from its deconvoluted CD spectrum perfectly matches that published previously (Rich et al., 1999).
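The two averaging steps described above (deviation of predictions from a solved structure, and averaging of the contents scored by several deconvolution programs) can be illustrated with a minimal Python sketch. All numbers below are hypothetical placeholders, not values from this study, and the function names are illustrative. The deviation is taken here as a mean absolute difference in percentage points, which is an assumption, since the text does not state whether the reported deviations are signed or absolute.

```python
from statistics import mean


def mean_abs_deviation(predicted, reference):
    """Mean absolute difference (percentage points) between predictions and a reference value."""
    return mean(abs(p - reference) for p in predicted)


def average_content(estimates):
    """Average the (alpha-helix %, beta-sheet %) pairs reported by several programs."""
    alphas = [alpha for alpha, _ in estimates.values()]
    betas = [beta for _, beta in estimates.values()]
    return mean(alphas), mean(betas)


# HYPOTHETICAL beta-sheet predictions (%) from three algorithms for one protein,
# and a HYPOTHETICAL beta-sheet content (%) taken from its crystal structure.
predicted_beta = {"PHD": 40.0, "DSC": 44.0, "SOPM": 42.0}
crystal_beta = 43.0
deviation = mean_abs_deviation(predicted_beta.values(), crystal_beta)

# HYPOTHETICAL (alpha %, beta %) estimates from three CD deconvolution programs.
cd_estimates = {
    "Contin": (10.0, 45.0),
    "Selcon": (12.0, 41.0),
    "VARSELEC": (8.0, 46.0),
}
alpha_mean, beta_mean = average_content(cd_estimates)

print(f"mean deviation: {deviation:.1f} percentage points")
print(f"averaged CD estimate: alpha {alpha_mean:.0f} %, beta {beta_mean:.0f} %")
```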


 
Table 3. Summary of secondary structure components

 
Genes encoding the MSCRAMM-like proteins are present in most strains of E. faecalis and the proteins are expressed in vivo
The presence of genes encoding the nine CWA proteins was examined in 30 E. faecalis isolates (nine endocarditis isolates and 21 other clinical isolates) by colony hybridization with probes generated by PCR using the primers listed in Table 1. Seven of the nine genes were found in all 30 isolates tested (Table 4); the exceptions were EF1824, which was present in 16/30 (53 %) isolates, and EF3023, which was present in 23/30 (77 %) isolates. No large differences between endocarditis and other clinical isolates were observed.
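As a quick arithmetic check, the percentages quoted above follow directly from the isolate counts. The short Python sketch below uses only the counts given in the text; the variable and function names are illustrative.

```python
TOTAL_ISOLATES = 30

# Number of isolates in which each gene was detected, as stated in the text.
gene_counts = {"EF1824": 16, "EF3023": 23}


def frequency_percent(present, total):
    """Return the detection frequency as a whole-number percentage."""
    return round(100 * present / total)


percentages = {
    gene: frequency_percent(count, TOTAL_ISOLATES)
    for gene, count in gene_counts.items()
}

for gene, pct in percentages.items():
    print(f"{gene}: {gene_counts[gene]}/{TOTAL_ISOLATES} ({pct} %)")
```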


 
Table 4. Distribution of genes encoding the nine EF proteins among E. faecalis isolates

 
Sera from two groups of nine hospitalized patients, one with and the other without a previously documented E. faecalis infection, were tested and the titres of antibodies against the nine enterococcal proteins were determined. As seen in Fig. 3, sera from infected individuals more frequently showed positive titres to the A-domains of the nine EF proteins than sera from the non-infected control group, indicating expression in vivo during an E. faecalis infection. Proteins EF1091, EF1092, EF1093 and EF2224 appear either to be expressed at higher levels during the infection process or to be more immunogenic than the other proteins, since five to nine of the nine patient sera showed positive titres against them. ACE is frequently expressed in infected human hosts, as demonstrated in an earlier study using the same serum collection (Nallapareddy et al., 2000a).



 
Fig. 3. Distribution of IgG titres against the MSCRAMM-like proteins in human sera. Nine sera from patients with a previous E. faecalis infection and another nine sera from a control group were tested for antibody titres as described in Methods.

 

   DISCUSSION
 
We have identified 14 putative CWA proteins with tandemly repeated Ig-like folds in E. faecalis among the panel of 29 putative CWA proteins with an LPX[TSA][GANS] anchoring sequence. If we expand our analyses to include the 12 additional ORFs annotated by TIGR as CWA proteins, three more potential MSCRAMMs are identified based on the predicted presence of repeated Ig-like folds. These ORFs, EF1896, EF2347 and EF2505, contain the CWA motifs FPQTG or YPKTG and hence were not picked up in our screen. Although these three proteins have not yet been expressed and characterized, they should be included in future analyses of putative MSCRAMMs of E. faecalis. We report here on the examination of eight previously uncharacterized CWA proteins with a more stringent LPXTG motif (Fig. 1). Six of these are predicted to consist mainly of Ig-like folds and have a structural organization characteristic of staphylococcal MSCRAMMs. Hence, these proteins are MSCRAMM candidates potentially interacting with host protein/peptide ligands. The remaining two proteins, EF1824 and EF3023, appear to fit the MSCRAMM model less well: their predicted Ig-fold regions cover less than a quarter of the primary sequences, they have a high content of α-helical secondary structure, and they show significant sequence similarity to known enzymes. Therefore, these two may be cell-wall-anchored proteins with catalytic functions.
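The reason the FPQTG and YPKTG motifs escape an LPX[TSA][GANS] screen can be illustrated with a small Python sketch that expresses the two motifs named above as regular expressions. This is only an illustration of the pattern match; it does not reproduce the full search procedure used in this study. The peptides LPKTG and LPQSA are hypothetical test inputs, whereas FPQTG and YPKTG are the motifs quoted in the text.

```python
import re

# X in the motif stands for any residue, written as "." in the regular expression.
BROAD_MOTIF = re.compile(r"LP.[TSA][GANS]")  # LPX[TSA][GANS]
STRICT_MOTIF = re.compile(r"LP.TG")          # LPXTG


def has_motif(pattern, sequence):
    """Return True if the anchoring motif occurs anywhere in the sequence."""
    return pattern.search(sequence) is not None


# LPKTG and LPQSA are HYPOTHETICAL examples; FPQTG and YPKTG come from the text.
for peptide in ("LPKTG", "LPQSA", "FPQTG", "YPKTG"):
    broad = has_motif(BROAD_MOTIF, peptide)
    strict = has_motif(STRICT_MOTIF, peptide)
    print(f"{peptide}: LPX[TSA][GANS]={broad}, LPXTG={strict}")
```

Both FPQTG and YPKTG fail at the first position, which both patterns require to be leucine, so a screen of this kind would miss them regardless of the residues that follow.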

In addition to the six novel putative enterococcal MSCRAMM proteins, ACE (EF1099), which can act as a collagen/laminin-binding adhesin (Nallapareddy et al., 2000b; Rich et al., 1999), was also identified in our search and characterized for comparison. We have recently been able to crystallize the A-region of ACE (Ponnuraj et al., 2002). Preliminary analyses of the crystal structure (Ponnuraj et al., in preparation) reveal two domains (N1 and N2), both representing the recently described DEv variant of the Ig-like fold initially found in the A-regions of staphylococcal MSCRAMMs (Deivanayagam et al., 2002; Ponnuraj et al., 2003; Symersky et al., 1997). Furthermore, the first Ig domain (N1) in ACE contains the sequence VATFNEKVE, which is found in a cleft similar to the TYTFTDYVD-containing latching cleft of the staphylococcal SdrG. EF1269 also appears to be composed of two N-terminal domains, the first of which contains the sequence TLTYTDYVE at a position similar to that of the cleft motif in SdrG. We therefore propose that ACE (EF1099) and EF1269 may be protein/peptide-binding MSCRAMMs that bind their ligands by a ‘dock, lock and latch’ mechanism similar to that proposed for SdrG. This may also hold for EF0089 and the so far uncharacterized EF2505. Both show lower primary sequence similarities with the two ligand-binding subdomains of staphylococcal MSCRAMMs (N1 and N2 in CNA, N2 and N3 in others) than do ACE and EF1269 (<20 % identity), but share equally strong structural similarity with the ClfA N2N3 fold when examined using the 3D-PSSM fold recognition server (Kelley et al., 2000). Furthermore, both proteins have potential TYTFTDYVD-like cleft motifs in conserved positions in their putative N2 subdomains. Interestingly, the not-yet-expressed proteins EF1896 and EF2347 also have TYTFTDYVD-like motifs in their N-terminal regions. Therefore, it is tempting to speculate that these proteins utilize the conserved ‘dock, lock and latch’ ligand-binding mechanism.
In summary, we have identified 15 putative CWA proteins predicted to consist mainly of Ig-folds in the E. faecalis genome; at least six of these contain TYTFTDYVD-like sequences in their N-terminal regions.
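The text does not define what makes a sequence ‘TYTFTDYVD-like’. One simple, assumed way to compare the motifs quoted above is to count identical residues position by position, as in the Python sketch below. The function name and the choice of plain positional identity as the measure are illustrative assumptions, not the criterion used in this study; the three sequences are those given in the text.

```python
SDRG_MOTIF = "TYTFTDYVD"  # latching-cleft motif of SdrG, as quoted in the text


def positional_identity(a, b):
    """Count positions at which two equal-length peptides carry the same residue."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x == y for x, y in zip(a, b))


# Motif sequences quoted in the text for ACE (N1 domain) and EF1269.
candidates = {"ACE": "VATFNEKVE", "EF1269": "TLTYTDYVE"}

identities = {
    name: positional_identity(SDRG_MOTIF, seq) for name, seq in candidates.items()
}

for name, count in identities.items():
    print(f"{name}: {count}/{len(SDRG_MOTIF)} identical positions")
```

A plain identity count ignores conservative substitutions such as D/E or F/Y, so it understates similarity compared with a substitution-matrix score.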

The seven genes encoding MSCRAMM-like proteins examined in this study (EF0089, EF1091, EF1092, EF1093, EF1099, EF1269 and EF2224) appear to be ubiquitous among E. faecalis strains, as they were present in all 30 strains studied (Table 4). Genes for the two putative cell-wall enzymes (EF1824 and EF3023) were present in most strains as well (53 and 77 %, respectively). We also demonstrated that these nine proteins were expressed in vivo in humans during an E. faecalis infection. Interestingly, the genes encoding the three proteins (EF1091, EF1092 and EF1093) with the highest titres (Fig. 3) are organized as a cluster, which is preceded by two putative promoter consensus regions and a well conserved ribosome-binding site. Thus, these genes apparently form an operon in the E. faecalis genome and are likely to be co-transcribed. The next gene downstream, EF1094, is preceded by a separate promoter consensus sequence and encodes a putative sortase. ACE (EF1099) is also closely linked to this region. This raises the possibility that this cluster of genes encoding MSCRAMM-like proteins and a sortase for their cell-wall coupling are co-regulated.

The family of putative MSCRAMMs identified in this study may contribute to the pathogenic potential of E. faecalis. Future studies of the individual MSCRAMMs will define their roles in the infection process, such as adhesion to host molecules, and hopefully identify targets for preventive and therapeutic strategies. For example, the use of antibodies targeting the binding epitopes of bacterial MSCRAMMs is a promising novel approach to prevent and treat infections caused by Gram-positive bacteria (Hall et al., 2003; Vernachio et al., 2003).


   ACKNOWLEDGEMENTS
 
This work was funded by a grant from Inhibitex Inc., and NIH grants AI 20624 and AR 44415 to M. H. and R37 AI 47923 to B. E. M.


   REFERENCES
 
Arduino, R. C., Murray, B. E. & Rakita, R. M. (1994). Roles of antibodies and complement in phagocytic killing of enterococci. Infect Immun 62, 987–993.

Coque, T. M., Patterson, J. E., Steckelberg, J. M. & Murray, B. E. (1995). Incidence of hemolysin, gelatinase, and aggregation substance among enterococci isolated from patients with endocarditis and other infections and from feces of hospitalized and community-based persons. J Infect Dis 171, 1223–1229.

Cucarella, C., Solano, C., Valle, J., Amorena, B., Lasa, I. & Penades, J. R. (2001). Bap, a Staphylococcus aureus surface protein involved in biofilm formation. J Bacteriol 183, 2888–2896.

Deivanayagam, C. C., Perkins, S., Danthuluri, S., Owens, R. T., Bice, T., Nanavathy, T., Foster, T. J., Hook, M. & Narayana, S. V. (1999). Crystallization of ClfA and ClfB fragments: the fibrinogen-binding surface proteins of Staphylococcus aureus. Acta Crystallogr D Biol Crystallogr 55, 554–556.

Deivanayagam, C. C., Rich, R. L., Carson, M., Owens, R. T., Danthuluri, S., Bice, T., Hook, M. & Narayana, S. V. (2000). Novel fold and assembly of the repetitive B region of the Staphylococcus aureus collagen-binding surface protein. Structure Fold Des 8, 67–78.

Deivanayagam, C. C., Wann, E. R., Chen, W., Carson, M., Rajashankar, K. R., Hook, M. & Narayana, S. V. (2002). A novel variant of the immunoglobulin fold in surface adhesins of Staphylococcus aureus: crystal structure of the fibrinogen-binding MSCRAMM, clumping factor A. EMBO J 21, 6660–6672.

Galli, D., Lottspeich, F. & Wirth, R. (1990). Sequence analysis of Enterococcus faecalis aggregation substance encoded by the sex pheromone plasmid pAD1. Mol Microbiol 4, 895–904.

Galli, D., Friesenegger, A. & Wirth, R. (1992). Transcriptional control of sex-pheromone-inducible genes on plasmid pAD1 of Enterococcus faecalis and sequence analysis of a third structural gene for (pPD1-encoded) aggregation substance. Mol Microbiol 6, 1297–1308.

Geourjon, C. & Deleage, G. (1994). SOPM: a self-optimized method for protein secondary structure prediction. Protein Eng 7, 157–164.

Hall, A. E., Domanski, P. J., Patel, P. R. & 7 other authors (2003). Characterization of a protective monoclonal antibody recognizing Staphylococcus aureus MSCRAMM protein clumping factor A. Infect Immun 71, 6864–6870.

Huycke, M. M., Sahm, D. F. & Gilmore, M. S. (1998). Multiple-drug resistant enterococci: the nature of the problem and an agenda for the future. Emerg Infect Dis 4, 239–249.

Josefsson, E., McCrea, K. W., Ni Eidhin, D., O'Connell, D., Cox, J., Hook, M. & Foster, T. J. (1998). Three new members of the serine-aspartate repeat protein multigene family of Staphylococcus aureus. Microbiology 144, 3387–3395.

Kelley, L. A., MacCallum, R. M. & Sternberg, M. J. E. (2000). Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 299, 499–520.

King, R. D. & Sternberg, M. J. (1996). Identification and application of the concepts important for accurate and reliable secondary structure prediction. Protein Sci 5, 2298–2310.

Mazmanian, S. K., Ton-That, H. & Schneewind, O. (2001). Sortase-catalysed anchoring of surface proteins to the cell wall of Staphylococcus aureus. Mol Microbiol 40, 1049–1057.

McCrea, K. W., Hartford, O., Davis, S., Eidhin, D. N., Lina, G., Speziale, P., Foster, T. J. & Hook, M. (2000). The serine-aspartate repeat (Sdr) protein family in Staphylococcus epidermidis. Microbiology 146, 1535–1546.

Murray, B. E. (1990). The life and times of the enterococcus. Clin Microbiol Rev 3, 46–65.

Murray, B. E. & Weinstock, G. M. (1999). Enterococci: new aspects of an old organism. Proc Assoc Am Physicians 111, 328–334.

Nallapareddy, S. R., Singh, K. V., Duh, R. W., Weinstock, G. M. & Murray, B. E. (2000a). Diversity of ace, a gene encoding a microbial surface component recognizing adhesive matrix molecules, from different strains of Enterococcus faecalis and evidence for production of ace during human infections. Infect Immun 68, 5210–5217.

Nallapareddy, S. R., Qin, X., Weinstock, G. M., Hook, M. & Murray, B. E. (2000b). Enterococcus faecalis adhesin, ace, mediates attachment to extracellular matrix proteins collagen type IV and laminin as well as collagen type I. Infect Immun 68, 5218–5224.

Pace, C. N., Vajdos, F., Fee, L., Grimsley, G. & Gray, T. (1995). How to measure and predict the molar absorption coefficient of a protein. Protein Sci 4, 2411–2423.

Pallen, M. J., Lam, A. C., Antonio, M. & Dunbar, K. (2001). An embarrassment of sortases – a richness of substrates? Trends Microbiol 9, 97–102.

Patti, J. M. & Hook, M. (1994). Microbial adhesins recognizing extracellular matrix macromolecules. Curr Biol 6, 752–758.

Paulsen, I. T., Banerjei, L., Myers, G. S. & 29 other authors (2003). Role of mobile DNA in the evolution of vancomycin-resistant Enterococcus faecalis. Science 299, 2071–2074.

Perkins, S., Walsh, E. J., Deivanayagam, C. C., Narayana, S. V., Foster, T. J. & Hook, M. (2001). Structural organization of the fibrinogen-binding region of the clumping factor B MSCRAMM of Staphylococcus aureus. J Biol Chem 276, 44721–44728.

Ponnuraj, K., Xu, Y., Moore, D., Deivanayagam, C. C., Boque, L., Hook, M. & Narayana, S. V. (2002). Crystallization and preliminary X-ray crystallographic analysis of Ace: a collagen-binding MSCRAMM from Enterococcus faecalis. Biochim Biophys Acta 1596, 173–176.

Ponnuraj, K., Bowden, G., Davis, S., Gurusiddappa, S., Moore, D., Choe, D., Xu, Y., Hook, M. & Narayana, S. V. (2003). A ‘dock, lock, and latch’ structural model for a staphylococcal adhesin binding to fibrinogen. Cell 115, 217–228.

Rich, R. L., Kreikemeyer, B., Owens, R. T., LaBrenz, S., Narayana, S. V., Weinstock, G. M., Murray, B. E. & Hook, M. (1999). Ace is a collagen-binding MSCRAMM from Enterococcus faecalis. J Biol Chem 274, 26939–26945.

Rost, B. & Sander, C. (1993). Prediction of protein secondary structure at better than 70 % accuracy. J Mol Biol 232, 584–599.

Rozdzinski, E., Marre, R., Susa, M., Wirth, R. & Muscholl-Silberhorn, A. (2001). Aggregation substance-mediated adherence of Enterococcus faecalis to immobilized extracellular matrix proteins. Microb Pathog 30, 211–220.

Sahm, D. F., Kissinger, J., Gilmore, M. S., Murray, P. R., Mulder, R., Solliday, J. & Clarke, B. (1989). In vitro susceptibility studies of vancomycin-resistant Enterococcus faecalis. Antimicrob Agents Chemother 33, 1588–1591.

Schneewind, O., Model, P. & Fischetti, V. A. (1992). Sorting of protein A to the staphylococcal cell wall. Cell 70, 267–281.

Styriak, I., Laukova, A., Fallgren, C. & Wadstrom, T. (1999). Binding of selected extracellular proteins to enterococci and Streptococcus bovis of animal origin. Curr Microbiol 39, 327–335.

Styriak, I., Laukova, A. & Ljungh, A. (2002). Lectin-like binding and antibiotic sensitivity of enterococci from wild herbivores. Microbiol Res 157, 293–303.

Symersky, J., Patti, J. M., Carson, M. & 8 other authors (1997). Structure of the collagen-binding domain from a Staphylococcus aureus adhesin. Nat Struct Biol 4, 833–838.

Tailor, S. A., Bailey, E. M. & Rybak, M. J. (1993). Enterococcus, an emerging pathogen. Ann Pharmacother 27, 1231–1242.

Vernachio, J., Bayer, A. S., Le, T. & 7 other authors (2003). Anti-clumping factor A immunoglobulin reduces the duration of methicillin-resistant Staphylococcus aureus bacteremia in an experimental model of infective endocarditis. Antimicrob Agents Chemother 47, 3400–3406.

Xiao, J., Hook, M., Weinstock, G. M. & Murray, B. E. (1998). Conditional adherence of Enterococcus faecalis to extracellular matrix proteins. FEMS Immunol Med Microbiol 21, 287–295.

Zareba, T. W., Pascu, C., Hryniewicz, W. & Wadstrom, T. (1997). Binding of extracellular matrix proteins by enterococci. Curr Microbiol 34, 6–11.

Received 2 February 2004; revised 30 March 2004; accepted 5 April 2004.



Copyright © 2004 Society for General Microbiology.