National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS-GKVK Campus, Bangalore 560 065, India
1 To whom correspondence should be addressed. E-mail: mini{at}ncbs.res.in
Keywords: distant similarity/intermediate sequences/N-acetyl transferase/OHHL synthase/protein structure prediction/quorum sensing
Introduction
AHL synthase superfamily members were identified in five other species by means of a lux-plasmid bioluminescent sensor for OHHL and by gene complementation studies (Swift et al., 1993). Some of these proteins differ in their substrate specificity and activate diverse biological events, but in a similar cell density-dependent manner. The plant pathogen Agrobacterium tumefaciens requires, as its conjugal transfer factor, OOHL (N-3-oxooctanoyl-L-homoserine lactone) produced by the traI gene product (Fuqua and Winans, 1994). In Pseudomonas aeruginosa, the production of exoproducts such as elastase is regulated via quorum sensing, and two pairs of LuxRI homologues have been identified, i.e. RhlRI and LasRI (Jones et al., 1993; Latifi et al., 1995). The major signal molecules produced via RhlI and LasI are N-butanoyl-L-homoserine lactone (BHL) and N-(3-oxododecanoyl)-L-homoserine lactone (OdDHL), respectively. Some strains of Gram-negative bacteria such as Erwinia carotovora make the simple carbapenem antibiotic via the CarRI system. CarI and EagI of Enterobacter agglomerans produce the freely diffusible molecule OHHL (Swift et al., 1993). Subsequently, 35-40 AHL synthase proteins have been identified [reviewed by Swift et al. (1996)]. Our earlier structure prediction studies on EagI, including fold recognition and three-dimensional modelling, suggested that the AHL synthase superfamily members are compatible with the N-acetyltransferase (NAT) fold (S.Chakrabarti and R.Sowdhamini, unpublished results).
Recently, the crystal structure of EsaI (a member of the AHL synthase superfamily) has shown that these proteins adopt the fold observed in N-acetyltransferases (Watson et al., 2002). The structures of the NAT superfamily (e.g. PDB codes 1qst, 1b87 and 1cjw) reveal an alpha- and beta-fold for this superfamily and a central V-shaped cavity which forms the acetyl-CoA (AcCoA) and coenzyme binding site. Despite the similarity in their AcCoA-binding regions, the three structures differ significantly in other regions presumed to be involved in binding the substrate to be acetylated.
The present study involved the sequence and structural analyses of AHL synthase superfamily members. We show, by recursive PSI-BLAST searches (Altschul et al., 1997) and by the identification of intermediate sequences (ISS), that the structural similarity observed between AHL synthase and NAT superfamilies may have evolutionary meaning. We further propose that the intermediate sequences could effectively be a connecting link between the two superfamilies. The crystal structure of EsaI (Watson et al., 2002) has enabled us to obtain three-dimensional models for the other AHL synthase superfamily members and to examine the spatial positions of putative functionally important residues identified by evolutionary trace methods. Evolutionary class-specific clusters point to several interesting residue substitutions in the hydrophobic region of the substrate-binding site that might account for their substrate specificity.
Methods
For the recursive PSI-BLAST runs, sequences from NAT (1cjw, 1qst, 1b87) and AHL synthase (EagI, EsaI, CarI, LuxI, RhlI, TraI and LasI) superfamilies were used as queries to search against the non-redundant sequence database. Subsequently, each hit identified was passed through a further round of PSI-BLAST. The intention behind such searches was to identify intermediate search sequences (ISS) that are similar to both superfamily members.
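The recursive search logic can be illustrated with a minimal sketch. It does not run PSI-BLAST; it assumes the hits of every search have already been collected into a hypothetical `hits` mapping (query identifier to the set of identifiers it retrieved), and the function names, the `rounds` parameter and the toy identifiers are illustrative assumptions rather than part of the original protocol. A candidate is reported as intermediate only if it is directly linked, in either search direction, to at least one member of each superfamily.

```python
def collect_candidates(seeds, hits, rounds):
    """Expand from the seed queries, re-using every new hit as a query."""
    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(rounds):
        # Hits of the current queries that have not been seen before.
        frontier = {h for q in frontier for h in hits.get(q, ())} - seen
        if not frontier:
            break
        seen |= frontier
    return seen


def linked(seq, family, hits):
    """True if seq retrieves, or is retrieved by, any member of family."""
    return any(
        member in hits.get(seq, ()) or seq in hits.get(member, ())
        for member in family
    )


def find_intermediates(hits, family_a, family_b, rounds=2):
    """Return sequences outside both families that are linked to both."""
    family_a, family_b = set(family_a), set(family_b)
    pool = collect_candidates(family_a | family_b, hits, rounds)
    pool -= family_a | family_b
    return {
        s for s in pool
        if linked(s, family_a, hits) and linked(s, family_b, hits)
    }


# Toy, made-up search results (identifiers are placeholders).
toy_hits = {
    "ahl_1": {"ahl_2", "cand_x"},
    "nat_1": {"nat_2", "cand_x"},
    "nat_2": {"nat_3"},
    "cand_x": {"ahl_1", "nat_1", "cand_y"},
    "cand_y": {"cand_x"},
}
intermediates = find_intermediates(
    toy_hits, {"ahl_1", "ahl_2"}, {"nat_1", "nat_2"}
)
```

In the toy data only `cand_x` is linked to both families; `nat_3` and `cand_y` are reached by the recursion but connect to at most one side.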
Hierarchical clustering and dendrogram construction were performed for AHL synthases, NAT representatives and ISS using sequence dissimilarity measures and PHYLIP 3.5 (Felsenstein, 1985). Non-redundant homologues of the three sets of proteins (at 90% identity cut-off), obtained by PSI-BLAST, were included in an expanded multiple alignment using MALIGN (Johnson et al., 1993). From the multiple alignment, distances based on sequence similarity were extracted and used as input to a principal component analysis (PCA) program (M.Johnson, unpublished results). Inter-sequence similarity profiles are represented in a three-dimensional projection where each point in the three-dimensional plot represents a particular sequence (Figure 1a). PCA involves a mathematical procedure that transforms a number of (possibly) correlated variables into a smaller number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible and each succeeding component accounts for as much of the remaining variability as possible.
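As an illustration of the PCA step described above, the following sketch extracts only the first principal component of a small data table by power iteration on the covariance matrix. It is not the unpublished PCA program used in this work; the function names, the starting vector and the iteration count are assumptions made for the example, and each row is taken to be the similarity profile of one sequence.

```python
import math


def center(rows):
    """Subtract the column means so every variable has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already centred rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    d = len(cov)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-uniform start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance left to explain
        v = [x / norm for x in w]
    return v


def project(centered, axis):
    """Coordinate of every row along the given axis."""
    return [sum(a * b for a, b in zip(row, axis)) for row in centered]


# Toy data: three points lying exactly on the line y = 2x.
toy_rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
toy_centered = center(toy_rows)
axis = first_component(covariance(toy_centered))
scores = project(toy_centered, axis)
```

For the toy points the first component lies along the direction (1, 2), so a single coordinate per point captures all of the variance. Further components would be obtained by removing this direction from the covariance matrix and repeating the iteration.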
Three-dimensional modelling was performed using MODELLER (Sali and Blundell, 1993) and subsequently loop re-modelling was performed using COMPOSER (Sutcliffe et al., 1987a,b). The crystal structure of EsaI (Watson et al., 2002) served as the structural template for EagI, CarI, LuxI, RhlI and LasI. Models were energy minimized using standard TRIPOS parameters and validated using VERIFY3D (Eisenberg et al., 1997). The crystal structure of serotonin acetyltransferase [1cjw (Hickman et al., 1999)], when used as the structural template, gave rise to the best ISS models (for C82060 and BAB06701.1), and 1b87 was similarly used as the template for T34942.
The different substrates for AHL synthases were modelled starting from a phosphopantetheine (PNT) backbone using CHNGEN (C.Ramakrishnan, unpublished results). The docking of S-adenosylmethionine (SAM) and the corresponding substrates was performed using GRAMM (Katchalski-Katzir et al., 1992; Vakser, 1995). Acyl-ACP was first docked to the protein model, followed by modelling the interaction with SAM. Subsequently, ligands of the template NAT structures were docked to the ISS models. Selected docked models were further refined using an energy-driven docking procedure in the SYBYL package (Tripos Associates). The final docked structures were examined for non-bonded energies, possible short contacts and hydrophobic interactions at the binding site. Residues in contact with SAM and acyl-PNT were identified by a liberal distance cut-off of 8 Å to account for possible structural alterations at the protein backbone that can occur during ligand binding.
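The distance criterion for contact residues can be sketched as follows. This is a minimal illustration assuming atom coordinates are already available as plain tuples in Å; the function name, the input layout and the toy residue labels are hypothetical and do not reproduce the SYBYL-based procedure used here.

```python
import math


def contact_residues(protein_atoms, ligand_atoms, cutoff=8.0):
    """Return residues with at least one atom within cutoff of the ligand.

    protein_atoms: iterable of (residue_id, (x, y, z)) pairs
    ligand_atoms:  iterable of (x, y, z) tuples
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for residue_id, coord in protein_atoms:
        # One ligand atom inside the cut-off is enough to count the residue.
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(residue_id)
    return sorted(contacts)


# Toy coordinates (not taken from any model in this study).
toy_protein = [
    ("res10", (0.0, 0.0, 0.0)),
    ("res10", (1.0, 0.0, 0.0)),
    ("res20", (10.5, 0.0, 0.0)),
    ("res30", (11.5, 0.0, 0.0)),
    ("res40", (20.0, 0.0, 0.0)),
]
toy_ligand = [(3.0, 0.0, 0.0)]
toy_contacts = contact_residues(toy_protein, toy_ligand)
```

A tighter cut-off (for example 4 Å) would select only residues in direct contact; the liberal 8 Å value deliberately includes a second shell of residues.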
Results and discussion
The 35 identified members of the AHL synthase superfamily were closely related to each other (25-57% sequence identity). More than 50 NAT members (13-18%) and 25 homologues of ISS (28-50%) were also identified. Both NAT and AHL synthase superfamilies catalyse similar biochemical reactions although they differ in the nature of the substrates. Moreover, the two superfamilies have similar folds (Watson et al., 2002), very similar active sites and ligand-binding clefts. The r.m.s.d. value between serotonin acetyltransferase (PDB code 1cjw) and EsaI (PDB code 1kzf) after the best superimposition is 1.8 Å in spite of poor sequence identity (11%).
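For readers who wish to reproduce such comparisons, the two quantities quoted above can be computed as sketched below. The sketch assumes that the sequences are already aligned (gaps as `-`) and that the coordinate sets are already optimally superimposed and listed in equivalent order; it does not perform the alignment or the superposition. Identity is counted over positions where neither sequence has a gap, which is one of several conventions in use and an assumption of this example.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over ungapped aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rmsd(coords_a, coords_b):
    """Root mean square deviation of equivalent, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Toy inputs (not real sequences or structures).
identity = percent_identity("ACD-F", "ACE-F")
deviation = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                 [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)])
```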
Search for intermediate sequences
We investigated the possibility that the similarities observed imply common ancestry or evolutionary origin. Through recursive PSI-BLAST searches, three sequences (C82060, BAB06701.1 and T34942) were identified that share distant similarity with both NAT and AHL synthase superfamilies. These intermediate search sequences (ISS), although of bacterial origin, are not functionally annotated as belonging to either the NAT or the AHL synthase superfamily. C82060 is an elaA protein in one of the chromosomes of the O1 strain of Vibrio cholerae. BAB06701.1 is from Bacillus halodurans and is a conserved unknown protein; T34942 is a hypothetical protein from Streptomyces coelicolor. Table I provides the cut-off parameters and the PSI-BLAST iteration number at which these similarities could be recognized and provides pairwise sequence identities. The mean sequence identity between the three ISS is 37%. Twenty-five sequence homologues of ISS were obtained starting from the ISS sequences, where the sequence identity with the three ISS ranges from 28 to 50%. Figure 1a shows the principal component analysis of the three sets of proteins (NATs, AHL synthase and ISS) along with their respective homologues. Table II summarizes the fold prediction results of ISS using three different methods [3DPSSM (Kelley et al., 2000), GenThreader (Jones, 1999b) and BIOINBGU (Fischer and Eisenberg, 1996)]. The top-ranking scores by three independent methods relate ISS to the NAT fold, indicating that these sequence connections are true positives. AHL synthase superfamily members are absent from these predictions because structural information for this superfamily was determined only recently and is therefore not yet included in the fold libraries. Furthermore, fold prediction exercises using other ISS homologues as query sequences are likely to yield similar results. This was confirmed by performing fold prediction for the five most distantly related ISS homologues. In each instance, the predicted folds were NAT-fold members.
Secondary structural equivalences between the two superfamilies and the intermediate sequences are fairly high. Pairwise alignments of the two superfamilies and the intermediate sequences annotated for predicted secondary structural positions show above 75% secondary structural conservation (data not shown). Figure 2 shows the multiple sequence alignment of EagI and its homologues (AHL synthases), three structural representatives from the NAT superfamily (PDB codes 1cjw, 1b87 and 1qst) and three ISS. The observed secondary structures in NAT members and in EsaI have been projected on this alignment.
Sequence similarity scores between AHL synthase, NATs and three ISS show a more conserved N-terminal part compared with their C-terminal half (data not shown), suggesting that the binding of SAM and AcCoA is common between these proteins and that the acyl-PNT-binding region is variable and class-specific. Evolutionary trace analyses (Lichtarge et al., 1996; Innis et al., 2000) were performed on NAT and AHL synthase superfamilies and ISS along with their respective homologues to identify invariant and class-specific residues. This analysis was not applied to a joint alignment across superfamilies since the method is only effective in classifying amino acid replacements amongst protein sequences where the sequence identity is reasonably high (30-70%).
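The distinction between invariant and class-specific positions can be illustrated with a simplified sketch. The evolutionary trace method derives its classes from a dendrogram at successive partition levels; the example below instead assumes that the classes are supplied by hand, and the function name, labels and toy alignment are illustrative assumptions only.

```python
def classify_columns(alignment, classes, gap="-"):
    """Label each alignment column as invariant, class-specific or variable.

    alignment: dict mapping sequence name -> aligned sequence
    classes:   dict mapping class label -> list of sequence names
    """
    length = len(next(iter(alignment.values())))
    labels = []
    for col in range(length):
        # Residues observed in this column, grouped by class.
        per_class = {
            label: {alignment[name][col] for name in names}
            for label, names in classes.items()
        }
        observed = set().union(*per_class.values())
        if gap in observed:
            labels.append("variable")
        elif len(observed) == 1:
            labels.append("invariant")
        elif all(len(residues) == 1 for residues in per_class.values()):
            # Conserved within every class but differing between classes.
            labels.append("class-specific")
        else:
            labels.append("variable")
    return labels


# Toy alignment with two hand-assigned classes (not real sequences).
toy_alignment = {"s1": "RFAG", "s2": "RFAS", "s3": "RLCG", "s4": "RLA-"}
toy_classes = {"x": ["s1", "s2"], "y": ["s3", "s4"]}
column_labels = classify_columns(toy_alignment, toy_classes)
```

In the toy alignment the first column is invariant, the second is class-specific (Phe in one class, Leu in the other) and the remaining columns are variable.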
Invariant and class-specific ET-identified residue patches are marked in the alignment (see Figure 2). These residues occur in six equivalent regions when compared between the three superfamilies. ISS are similar to AHL synthase superfamily members, with a conserved Arg and Phe in Region I, an LFGI/W motif in Region II and RLL in Region III. This suggests that there are additional players in ISS associated with a NAT-like function. Perhaps ISS are involved in protein-protein interactions, in Region II, analogous to ACP binding of AHL synthases. Region VI shows high general conservation across all the members in the alignment (e.g. RXG followed by a hydrophobic residue). The last 20-25 residues in the C-terminal half of the AHL synthase superfamily members may not be critical for their enzymatic activity, as suggested by deletion mutation studies (Swift et al., 1993). Interestingly, residues in this region are identified as neither invariant nor class-specific within the AHL superfamily by the ET method.
Amongst the AHL synthase superfamily members, ET-identified invariant residues such as Arg at alignment positions 30, 84 and 118 and negatively charged residues in Region I (alignment positions 53, 55 and 58) could be critical for function. Interestingly, single mutations at all these corresponding positions in LuxI have rendered the protein inactive (Hanzelka et al., 1997). Invariant ET-residues in other regions (including Arg at alignment position 38 in Region I, positions 115, 117 and 119 in Region IV, and positions 201 and 202 following Region V) could also be important for binding to the other substrate. RhlI, which is involved in the synthesis of BHL, shows anomalous behavior in the invariant residues. Residues which remain invariant in the other AHL synthase superfamily members, such as Ser82, Ser116, Phe119, Ser/Thr140 and Thr161 (alignment position numbers), are replaced by Cys, Leu, Tyr, Ala and Ala in RhlI, respectively.
Our evolutionary trace analysis identifies sets of class-specific residues within AHL synthases (Table III). As seen by independent docking studies of the ligands to the models (see later), these residues are within ligand-interacting distances. The class-specific amino acid residues in Region III perhaps serve as a secondary interaction site for substrate (acyl-PNT) binding. In the final docked models, Region III is within interacting distance of the variable substrate (indicated in bold in the alignment in Figure 2). Bulky hydrophobic residues such as Phe85, Tyr113, Phe120, Ile138, Leu142 and Ile162 (alignment positions) in AHL synthase superfamily members are replaced by smaller hydrophobic residues such as Leu/Ile, Met/Ile, Cys/Ala, Ala/Cys, Ala and Ala/Val in 8/12-acyl AHL synthases, respectively. On the other hand, the class-specific amino acid exchanges in RhlI (which binds to a smaller substrate) relate to the accommodation of bulkier hydrophobic residues at the putative acyl-binding site (e.g. Trp114, Tyr198 and Phe199 at the alignment positions). RhlI recognizes a non-oxidized butyl chain as its substrate (Winson et al., 1995), whereas the other members bind to oxidized acyl chains. Interestingly, one class-specific residue identified at alignment position 161, which is a conserved Thr in other AHL synthases but an alanine in RhlI, interacts with the 3-oxo state of the acyl chain in other members. Even amongst the ET-marked class-specific residues, RhlI exhibits anomalous residue substitutions at several alignment positions (in Table III, for example, Ala at alignment position 120 and Val at position 162). Certain class-specific mutations involve polar/charged residue substitutions in longer-acyl AHL synthases at the corresponding hydrophobic putative substrate-binding site (e.g. residues Glu and Arg in LasI at alignment positions 141 and 144).
Three-dimensional models were obtained for the AHL synthases EagI, LuxI, CarI, RhlI, TraI and LasI and for the ISS using the template structures as described in Methods. The models, especially of the AHL synthases, were reasonable as reflected by high validation scores (Table IV). EagI, LuxI, CarI, RhlI, TraI and LasI are very similar in their backbone positions (average r.m.s.d. 0.80 Å). Facile docking was possible in all the cases, where the acyl-PNTs and SAMs are embedded inside the V-shaped cleft such that the amine N-atom is oriented close to the first carbonyl carbon of the PNT acyl chain. This follows the mechanism where cyclization of SAM is driven by the nucleophile N being proximate to the C1 carbon of the phosphopantetheine moiety. A nucleophilic attack initiates the lactonization and cyclization of SAM.
The accessible surface area buried as a result of docking the two small molecules, acyl-PNT and SAM, is of the order of 350 Å² (380 for EagI and 304 for LasI). The final docked structures provide around 25 residues within interacting distance of the ligands (25 and 28 for EagI and LasI, respectively). A majority of the interacting residues are non-polar/hydrophobic (80-85%) in nature. In all the homologous structures except RhlI, the bisubstrate binding modes were very similar. In RhlI, the orientation of SAM is such that the cyclic group of SAM is docked opposite to the binding pocket but the amine N-atom points to the PNT acyl chain. This difference could be due to the small size of butanoyl-PNT leading to tighter binding or due to limitations of the docking. The acyl-PNT binding mode is similar in all AHL synthase members: in the longer acyl-PNTs, the acyl chain is extended towards the open face of the groove (facing side in Figure 3) surrounded by hydrophobic residues. The significance of the greater extent of acidic residues at the acyl-PNT and substrate binding site in LasI (also picked up by the ET method) is unclear; perhaps this allows the longer-chain products to be released with lesser affinities to the protein.
In the form where serotonin acetyltransferase is complexed with the bisubstrate analogue, helix 1 in loop 1 (residues 40-80) is longer and more rigid. A similar conformation of loop 1 is also observed in the other structural members of this superfamily bound only to AcCoA, implying that this is the binding site for the acetyl donor. This region is mobile and ill-defined in the uncomplexed form of the EsaI structure (Watson et al., 2002). However, these conformational changes in helix 1 upon binding could be secondary and may not correspond to the actual region of contact with SAM. Our examination of the final bisubstrate-docked models shows that residues in Regions I-IV (see Figure 2) are within interacting distance of SAM and that Regions V-VI are involved in interaction with acyl-PNT. SAM and acyl-PNT binding are in predominantly hydrophobic regions, similar to the crystal structures of NAT members. In addition, hydrophobic patches of an amphipathic helix prior to ET-marked Region IV (77-87 in EsaI; alignment position 100) are conserved amongst the AHL synthase superfamily; an equivalent helix is not present in the NAT superfamily except 1cjw. This helix could provide additional hydrophobic stabilization to the substrates, especially for the molecules with a larger number of acyl groups.
This paper reports the presence of intermediate sequences connecting two superfamilies that have similar folds. Such intermediate sequence searches (Park et al., 1997) are powerful tools in recognizing distant similarities and discovering new evolutionary pathways. PSI-BLAST searches with an AHL synthase sequence as the query retrieve the ISSs, and vice versa. This distant but clear relationship between ISSs and AHL synthases, despite the closer similarity to NAT members, demonstrates that ISSs are not just homologues of NATs but are connecting links between the two superfamilies. ISS also retain local similarity to AHL synthases at the putative binding site responsible for the interaction of AHL synthases with acyl-carrier proteins. The evolutionary bridge between NAT and AHL synthases could therefore suggest that the intermediate sequences perform N-acetyltransferase activity but are involved in protein-protein interactions. These additional proteins could have a direct role (such as in the transport of the substrate or the product) or a facilitatory role in the acetylation reaction.
This paper also reports a search for sequence determinants of dramatic differences in specificity amongst homologous proteins. These include the class-specific residue exchanges observed by ET, the observation of an acidic patch in our LasI model and the anomalous behavior of RhlI in residue substitutions. Several interesting ET-identified sites, such as invariant sites at alignment positions 38 and 115 and class-specific residues at alignment positions 85, 99, 120, 123, 137, 141, 142, 144 and 162, have not been discussed earlier. Class-specific residues are especially hard to identify in the absence of an objective algorithm. A total complementarity in amino acid exchanges is not expected since the binding of these homologous proteins to their respective substrates, although high, is not extremely specific; RhlI and LasI are shown also to promote the synthesis of OHHL in small amounts. In addition, conformational changes at the backbone of homologous AHL synthases could doubtless contribute to substrate specificity amongst closely related members. The current modelling strategies are, however, insufficient to depict large conformational differences. In spite of this limitation, several interesting residue replacements between EsaI, LasI and TraI have been identified. The class-specific residue exchanges can be starting points to design targets for mutagenesis and to develop drugs to inhibit quorum sensing.
References
Cuff,J.A. and Barton,G.J. (1999) Proteins, 34, 508-519.
Eisenberg,D., Luthy,R. and Bowie,J.U. (1997) Methods Enzymol., 277, 396-404.
Felsenstein,J. (1985) Evolution, 39, 783-791.
Fischer,D. and Eisenberg,D. (1996) Protein Sci., 5, 947-955.
Fuqua,W.C. and Greenberg,E.P. (1998) Curr. Opin. Microbiol., 1, 183-189.
Fuqua,W.C. and Winans,S.C. (1994) J. Bacteriol., 2796-2806.
Fuqua,W.C., Winans,S.C. and Greenberg,E.P. (1996) Annu. Rev. Microbiol., 50, 727-751.
Hanzelka,B.L., Stevens,A.M., Parsek,M.R., Cronje,T.J. and Greenberg,E.P. (1997) J. Bacteriol., 4882-4887.
Hickman,A.S., Namboodiri,M.A.A., Klein,D.C. and Dyda,F. (1999) Cell, 97, 361-369.
Innis,A., Shi,J. and Blundell,T.L. (2000) Protein Eng., 13, 839-847.
Johnson,M.S., Overington,J. and Blundell,T.L. (1993) J. Mol. Biol., 233, 735-752.
Jones,D.T. (1999a) J. Mol. Biol., 292, 195-202.
Jones,D.T. (1999b) J. Mol. Biol., 287, 797-815.
Jones,S., Yu,B., Bainton,N.J., Birdsall,M., Bycroft,B.W. and Chhabra,S.R. (1993) EMBO J., 12, 2477-2482.
Katchalski-Katzir,E., Shariv,I., Eisenstein,M., Friesem,A.A., Aflalo,C. and Vakser,I.A. (1992) Proc. Natl Acad. Sci. USA, 89, 2195-2199.
Kelley,L.A., MacCallum,R.M. and Sternberg,M.J. (2000) J. Mol. Biol., 299, 499-520.
Kneller,D.G., Cohen,F.E. and Langridge,R. (1990) J. Mol. Biol., 214, 171-182.
Latifi,A., Winson,M.K., Foglino,M., Bycroft,B.W., Stewart,G.S.A.B., Lazdunski,A. and Williams,P. (1995) Mol. Microbiol., 17, 333-343.
Lichtarge,O., Bourne,H.R. and Cohen,F.E. (1996) J. Mol. Biol., 257, 342-358.
Nicholls,A., Sharp,K.A. and Honig,B. (1991) Proteins, 11, 281-296.
Park,J., Teichmann,S.A., Hubbard,T. and Chothia,C. (1997) J. Mol. Biol., 273, 349-354.
Parsek,M.R., Val,D.L., Hanzelka,B.L., Cronan,J.E.,Jr and Greenberg,E.P. (1999) Proc. Natl Acad. Sci. USA, 96, 4360-4365.
Rost,B. and Sander,C. (1993) J. Mol. Biol., 232, 584-599.
Sali,A. and Blundell,T.L. (1990) J. Mol. Biol., 212, 403-428.
Sali,A. and Blundell,T.L. (1993) J. Mol. Biol., 234, 779-815.
Salmond,G.P.C., Bycroft,B.W., Stewart,G.S.A.B. and Williams,P. (1995) Mol. Microbiol., 16, 615-624.
Sutcliffe,M.J., Haneef,I., Carney,D. and Blundell,T.L. (1987a) Protein Eng., 1, 377-384.
Sutcliffe,M.J., Hayes,F.R. and Blundell,T.L. (1987b) Protein Eng., 1, 385-392.
Swift,S., Winson,M.K., Chan,P.F., Bainton,N.J., Birdsall,M., Reeves,P.J., Rees,C.E., Chhabra,S.R., Hill,P.J. and Throup,J.P. (1993) Mol. Microbiol., 10, 511-520.
Swift,S., Throup,J.P., Williams,P., Salmond,G.P.C. and Stewart,G.S.A.B. (1996) Trends Biochem. Sci., 21, 214-219.
Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Nucleic Acids Res., 22, 4673-4680.
Vakser,I.A. (1995) Protein Eng., 8, 371-377.
Watson,W.T., Minogue,T.D., Val,D.L., von Bodman,S.B. and Churchill,M.E.A. (2002) Mol. Cell, 9, 685-694.
Winson,M.K., Camara,M., Latifi,A., Foglino,M., Chhabra,S.R. and Daykin,M. (1995) Proc. Natl Acad. Sci. USA, 92, 9427-9431.
Received October 11, 2002; revised January 20, 2003; accepted February 5, 2003.