2 Embrapa Genetic Resources and Biotechnology, Cenargen/Embrapa, Brasília-DF, Brazil; and 3 Universidade Católica de Brasília, Pós-Graduação em Ciências Genômicas e Biotecnologia, SGAN Quadra 916, Módulo B, Av. W5 Norte, 70.790-160 Brasília-DF, Brazil
Received on April 24, 2003; revised on June 27, 2003; accepted on June 27, 2003
Abstract
Key words: evolutionary relationships / fold recognition / glycosyltransferases / MurG / SpsA
Introduction
As of June 2003, the carbohydrate-active enzymes (CAZY) classification contains 65 GT families, defined on the basis of sequence similarity (Coutinho and Henrissat, 1999; Coutinho et al., 2003). With the ongoing determination of GT structures (for review, see Unligil and Rini, 2000) and computational analyses (Wrabl and Grishin, 2001; Breton et al., 2002), a picture has emerged of two GT superfamilies, each containing various families that do not necessarily share significant sequence similarity. The most studied family within superfamily GT-A is family 2, which contains the inverting glycosyltransferase SpsA from Bacillus subtilis (Figure 1) (Charnock and Davies, 1999; Tarbouriech et al., 2001). This enzyme acts in spore coat formation, and its homologs include cellulose synthase and numerous proteins involved in bacterial cell surface glycosylation (as reviewed by Unligil and Rini, 2000). The structure of SpsA is a single domain consisting of parallel β-strands flanked on either side by α-helices (Figure 1) (Charnock and Davies, 1999). Within GT-B, family 28 has received most attention, particularly the MurG protein. This enzyme is an N-acetylglucosaminyltransferase involved in the intracellular phase of bacterial peptidoglycan biosynthesis; it catalyzes the transfer of N-acetyl-D-glucosamine (GlcNAc) from UDP-GlcNAc to the C4 hydroxyl group of a lipid-linked N-acetylmuramoyl pentapeptide (Ikeda et al., 1990). The structure of MurG contains two domains, each with the Rossmann fold α/β open sheet structure, separated by a deep cleft in which the substrates bind (Figure 1) (Ha et al., 2000; Hu et al., 2003).
Results and discussion
A striking conclusion is that even after comprehensive fold recognition, no folds could be assigned to 45% of GT families, containing 39% of known GT sequences. There are two alternative explanations. First, the CAZY database may contain a large number of extremely divergent GT families that in fact possess known GT folds but whose catalytic domain folds are not identifiable, even after the application of advanced fold recognition tools (Fischer and Rychlewski, 2003). However, given the success of fold recognition when applied to carbohydrate-active enzymes (Rigden and Franco, 2002; Rigden, 2002), it is perhaps more plausible to suppose the existence of one or more different folds not currently associated with GT activity. Given the importance of GTs in general (Unligil and Rini, 2000), these would be very significant targets for structural determination.
Because fold recognition can produce significant results for structural analogs as well as distant structural homologs, we sought further evidence regarding evolutionary relationships with sensitive sequence comparisons carried out using PSI-BLAST. The results (Figure 2) provide evidence that most of the families recently identified as members of GT-A (Breton et al., 2002) (Table I) share a common evolutionary origin, although in four cases (families 16, 54, 55, and 60) no relationships could be established (Figure 2). None of the illustrated families leaves the network when only results obtained at the more conservative E-value threshold of 0.001 are considered.
Conservation of residues important to enzymatic activity
Current knowledge of GT mechanisms suggests the obligatory involvement of conserved catalytic acidic residues in inverting enzymes. However, as reviewed by Davies and Henrissat (2002), the mechanism of retaining GTs remains obscure. A mechanism analogous to that of retaining glycoside hydrolases would involve a conserved basic residue; alternative mechanisms might not, although stabilization of the positively charged intermediates might be effectively carried out by acidic residues. It has also often proved difficult, particularly within superfamily GT-B, to locate these catalytic residues, even with the benefit of structural information. For example, based on the structure of the inverting enzyme MurG complexed to UDP-GlcNAc, several acidic residues were mutated, but the loss of none of these abolished catalytic activity (Hu et al., 2003). Thus no aspartate or glutamate has so far been definitively identified as catalytic in the GT-B fold superfamily.
Using the fold recognition alignments, totally (100%) and strongly (90%) conserved residues in GT families were identified and their positions compared within superfamilies (Figure 3A, 3B). In this way, tendencies in conserved acidic residue positioning were sought that might help locate the catalytic acidic residues. However, three possible complications had to be borne in mind: inaccuracies in the fold recognition alignments, the existence of divergent nonexpressed genome sequences, and the presence of noncatalytic proteins within GT families (Unligil and Rini, 2000). It was also necessary to identify conserved acidic residues with known noncatalytic roles.
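The conservation criterion described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in this work: the function name, the toy alignment, and the decision to pool Asp and Glu and to count gaps as non-acidic are all assumptions made for illustration.

```python
ACIDIC = {"D", "E"}  # Asp and Glu, one-letter codes


def conserved_acidic_positions(alignment, strong_pct=90):
    """Return (column, percent_acidic, label) for each alignment column in
    which Asp/Glu together reach strong_pct of all sequences.

    Columns are numbered from 1. Gaps count as non-acidic (an assumption).
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    n = len(alignment)
    found = []
    for col in range(length):
        acidic = sum(1 for seq in alignment if seq[col].upper() in ACIDIC)
        # Integer comparison avoids floating-point edge cases at the cutoff.
        if acidic * 100 >= strong_pct * n:
            label = "total" if acidic == n else "strong"
            found.append((col + 1, 100 * acidic / n, label))
    return found


# Hypothetical ten-sequence alignment, invented purely for illustration.
toy_alignment = [
    "ADGKD", "ADGKD", "ADGRD", "AEGKD", "ADGKE",
    "ADAKD", "ADGKD", "ADGKD", "SDGKN", "ADGKD",
]

for column, percent, label in conserved_acidic_positions(toy_alignment):
    print(column, percent, label)
```

In this toy case column 2 is reported as totally conserved and column 5 as strongly conserved; a real analysis would also need to account for the alignment inaccuracies and noncatalytic family members noted above.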
Another conserved Asp (numbered 39 in SpsA), located at the end of strand β-2 (Figures 1 and 3A), is involved in nucleotide binding, interacting with the uracil moiety by making a hydrogen bond to N3 of the uracil base (Charnock and Davies, 1999). Figure 3A shows that only families 12 and 16 contain fully conserved aspartic acids in the vicinity. The remaining inverting families, with the exception of family 54, have a 90% conserved Asp and/or Glu. In family 54, 75% of sequences contain a conserved Glu residue in this same position. The retaining families 27, 45, and 55 all have conserved acidic residues for binding UDP. In contrast, families 60, 62, and 64 do not show any conservation of acidic residues at this position (Figure 3A), although some of these similarly bind UDP nucleotide compounds. Presumably other specificity-conferring mechanisms function in these families.
GT-A also contains a motif known as the DxD motif (Wiggins and Munro, 1998; Shibayama et al., 1998; Tarbouriech et al., 2001) (Figure 3), although the arrangement of the two Asp residues varies. In SpsA, this motif adopts the sequence xD98D99. In four representative structures available for the folding superfamily, the first aspartate residue of the motif binds to hydroxyl groups on the ribose moiety, whereas the second aspartic acid binds the divalent metal ion, which could be Mn2+ (Tarbouriech et al., 2001). The Mn2+ ion is clearly positioned to counter the negative charge that develops on the β-phosphate on cleavage of the donor sugar–phosphate linkage (Cowan, 1998). As shown in Figure 3A, with the exception of families 27 and 62, all other families have an acidic residue in the region that is at least 90% conserved. In family 27 the figure drops to 88%. Again, the families newly assigned to GT-A provide a surprise: no conserved acidic residue is seen in this region for family 62.
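Because the arrangement of the two Asp residues varies, a simple sequence scan has to look for more than one pattern. The following Python sketch, offered only as an illustration, reports both the canonical D-x-D spacing and the adjacent D-D arrangement seen in SpsA. The function name and the test sequence are hypothetical; the sequence is not a real GT sequence.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
DXD = re.compile(r"(?=(D[A-Z]D))")  # Asp, any residue, Asp
DD = re.compile(r"(?=(DD))")        # two adjacent Asp residues (xDD variant)


def find_acidic_motifs(sequence):
    """Return a sorted list of (position, matched_residues, kind) tuples.

    Positions are 1-based and refer to the first Asp of the match.
    """
    seq = sequence.upper()
    hits = []
    for match in DXD.finditer(seq):
        hits.append((match.start() + 1, match.group(1), "DxD"))
    for match in DD.finditer(seq):
        hits.append((match.start() + 1, match.group(1), "DD"))
    return sorted(hits)


# Hypothetical peptide, invented for illustration only.
print(find_acidic_motifs("MKDADLLVDDG"))
```

A pattern match of this kind only flags candidates; whether a hit corresponds to the ribose- and metal-binding motif still depends on its position in the structural alignment.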
Our understanding of the mechanism in GT-B is less advanced. One particular region of sequence conservation between families of the GT-B superfamily (Ha et al., 2000) lies around residues 250–290 (MurG numbering). Three G-loops (glycine-rich loops located at turns between the carboxyl ends of β-strands and the N-termini of the following α-helices in Rossmann fold domains; Baker et al., 1992) have also been the focus of attention (Ha et al., 2000). Later structural determination of the UDP-GlcNAc:MurG complex (Hu et al., 2003) revealed that two G-loops and the 250–290 conserved region are responsible for binding to the UDP-GlcNAc substrate. All inverting families had acidic residues conserved in this region, with the exception of families 9 and 41. In family 41, 78% of sequences contained a conserved glutamine. Among the residues forming hydrogen bonds to the substrate is Glu269 (Figure 1), for which a role in distinguishing between UDP and TDP has been proposed (Hu et al., 2003). This residue is 90% conserved in family 28 (containing MurG itself) and 100% conserved in family 30 (Figure 3B), whereas an Asp residue may functionally substitute in families 19 and 33. A catalytic role for this residue may be effectively ruled out because its replacement with Ala in MurG led to only a modest loss of activity (Hu et al., 2003). Families 5, 9, and 41 do not have any conservation of acidic residues in this region and must use other mechanisms to confer substrate specificity.
From these analyses it is clear that further work is required to help locate catalytic acidic residues in GT-B and in retaining families of GT-A. The lack of any perceptible trends in positioning of conserved acidic residues in GT-B (Figure 3B), even considering only inverting enzymes, suggests that several catalytic site architectures may well be present in the superfamily.
Materials and methods
Iterated sequence database searches were carried out using PSI-BLAST (Altschul et al., 1997) at the NCBI (www.ncbi.nlm.nih.gov/BLAST), using either 0.01 or 0.001 as the E-value cut-off, below which a sequence is included in the next iteration. As input for the fold recognition and iterated database searches we used representative sequences of all families as listed in Table I. The appearance of a member of a given GT family among the significant results of a search started from a member of a different GT family was taken to indicate significant sequence similarity and hence to support a possible common evolutionary origin for the two families.
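The way cross-family hits translate into a network of related families can be sketched in Python as follows. This is a hedged illustration rather than the pipeline used here: parsing of PSI-BLAST output is not shown, the `(query_family, hit_family, evalue)` records and family labels are invented, and applying the cut-off directly to hit E-values is a simplifying assumption.

```python
from collections import defaultdict


def family_network(hits, evalue_cutoff=0.01):
    """Build an undirected graph linking families that hit one another.

    hits: iterable of (query_family, hit_family, evalue) records.
    """
    graph = defaultdict(set)
    for query_family, hit_family, evalue in hits:
        graph[query_family]  # register the query even if it stays unlinked
        if hit_family != query_family and evalue <= evalue_cutoff:
            graph[query_family].add(hit_family)
            graph[hit_family].add(query_family)
    return graph


def connected_groups(graph):
    """Return the sorted member lists of each connected component."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        group = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical search results; labels and E-values are made up.
toy_hits = [
    ("A", "B", 1e-5),
    ("B", "C", 0.005),
    ("C", "A", 0.5),
    ("D", "D", 1e-30),
]

print(connected_groups(family_network(toy_hits, evalue_cutoff=0.01)))
print(connected_groups(family_network(toy_hits, evalue_cutoff=0.001)))
```

Running the grouping at both cut-offs shows which links survive the more conservative threshold: in the toy data, family C drops out of the A/B group at 0.001.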
References
Baker, P.J., Britton, K.L., Rice, D.W., Rob, A., and Stillman, T.J. (1992) Structural consequences of sequence patterns in the fingerprint region of the nucleotide binding fold: implications for nucleotide specificity. J. Mol. Biol., 228, 662–671.
Breton, C., Bettler, E., Joziasse, D.H., Geremia, R.A., and Imberty, A. (1998) Sequence-function relationship of prokaryotic and eukaryotic galactosyltransferases. J. Biochem. (Tokyo), 123, 1000–1009.
Breton, C., Heissigerova, H., Jeanneau, C., Moravcova, J., and Imberty, A. (2002) Comparative aspects of glycosyltransferases. Biochem. Soc. Symp., 69, 23–32.
Brown, N.P., Leroy, C., and Sander, C. (1998) MView: a Web compatible database search or multiple alignment viewer. Bioinformatics, 14, 380–381.
Bujnicki, J.M., Elofsson, A., Fischer, D., and Rychlewski, L. (2001a) Structure prediction meta server. Bioinformatics, 17, 750–751.
Bujnicki, J.M., Elofsson, A., Fischer, D., and Rychlewski, L. (2001b) LiveBench-1: continuous benchmarking of protein structure prediction servers. Protein Sci., 10, 352–361.
Campbell, J.A., Davies, G.J., Bulone, V., and Henrissat, B. (1998) A classification of nucleotide-diphospho-sugar glycosyltransferases based on amino acid sequence similarities. Biochem. J., 329, 719.
Charnock, S.J. and Davies, G.J. (1999) Structure of the nucleotide-diphospho-sugar transferase, SpsA from Bacillus subtilis, in native and nucleotide-complexed forms. Biochemistry, 38, 6380–6385.
Colonna-Romano, S., Porta, A., Franco, A., Kobayashi, G.S., and Maresca, B. (1998) Identification and isolation by DDRT-PCR of genes differentially expressed by Histoplasma capsulatum during macrophages infection. Microb. Pathog., 25, 55–66.
Coutinho, P.M. and Henrissat, B. (1999) Carbohydrate-active enzymes server. Available at http://afmb.cnrs-mrs.fr/CAZY. Accessed August 20, 2003.
Coutinho, P.M., Deleury, E., Davies, G.J., and Henrissat, B. (2003) An evolving hierarchical family classification for glycosyltransferases. J. Mol. Biol., 328, 307–317.
Cowan, J.A. (1998) Magnesium activation of nuclease enzymes: the importance of water. Inorg. Chim. Acta, 275, 24–27.
Davies, G.J. and Henrissat, B. (2002) Plant glyco-related genomics. Structural enzymology of carbohydrate-active enzymes: implications for the post-genomic era. Biochem. Soc. Trans., 30, 291–297.
Fischer, D. (2000) Hybrid fold recognition: combining sequence derived properties with evolutionary information. In Altman, R.B., Dunker, A.K., Hunter, L., Lauderdale, K., and Klein, T.E. (Eds.), Pacific Symposium on Biocomputing. World Scientific, Singapore, pp. 119–130.
Fischer, D. (2003) 3D-SHOTGUN: a novel, cooperative, fold-recognition meta-predictor. Proteins, 51, 434–441.
Fischer, D. and Rychlewski, L. (2003) The 2002 Olympic games of protein structure prediction. Protein Eng., 16, 157–160.
Garinot-Schneider, C., Lellouch, A.C., and Geremia, R.A. (2000) Identification of essential amino-acid residues in the Sinorhizobium meliloti glucosyltransferase ExoM. J. Biol. Chem., 275, 31407–31423.
Gastinel, L.N., Cambillau, C., and Bourne, Y. (1999) Crystal structures of the bovine β4-galactosyltransferase catalytic domain and its complex with uridine diphosphogalactose. EMBO J., 18, 3546–3557.
Guex, N. and Peitsch, M.C. (1997) Swiss-model and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18, 2714–2723.
Ha, S., Walker, D., Shi, Y., and Walker, S. (2000) The 1.9 Å crystal structure of Escherichia coli MurG, a membrane-associated glycosyltransferase involved in peptidoglycan biosynthesis. Protein Sci., 9, 1045–1052.
Higgins, D., Thompson, J., Gibson, T., Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.
Hu, Y., Chen, L., Ha, S., Gross, B., Falcone, B., Walker, D., Mokhtarzadeh, M., and Walker, S. (2003) Crystal structure of the MurG:UDP-GlcNAc complex reveals common structural principles of a superfamily of glycosyltransferases. Proc. Natl Acad. Sci. USA, 100, 845–849.
Ikeda, M., Wachi, M., Jung, H.K., Ishino, F., and Matsuhashi, M. (1990) Nucleotide sequence involving murG and murC in the mra gene cluster region of Escherichia coli. Nucleic Acids Res., 18, 4014.
Keenleyside, W.J., Clarke, A.J., and Whitfield, C. (2001) Identification of residues involved in catalytic activity of the inverting glycosyl transferase WbbE from Salmonella enterica serovar borreze. J. Bacteriol., 183, 77–85.
Kelley, L.A., MacCallum, R.M., and Sternberg, M.J. (2000) Enhanced genome annotation using structural profiles in the program 3D-PSSM. J. Mol. Biol., 299, 499–520.
Lundstrom, J., Rychlewski, L., Bujnicki, J., and Elofsson, A. (2001) Pcons: a neural-network-based consensus predictor that improves fold recognition. Protein Sci., 10, 2354–2362.
Parkhill, J., Wren, B.W., Thomson, N.R., Titball, R.W., Holden, M.T.G., Prentice, M.B., Sebaihia, M., James, K.D., Churcher, C., Mungall, K.L., and others. (2001) Genome sequence of Yersinia pestis, the causative agent of plague. Nature, 413, 523–527.
Pedersen, L.C., Tsuchida, K., Kitagawa, H., Sugahara, K., Darden, T.A., and Negishi, M. (2000) Heparan/chondroitin sulfate biosynthesis: structure and mechanism of human glucuronyltransferase I. J. Biol. Chem., 275, 34580–34585.
Rigden, D.J. (2002) Iterative database searches demonstrate that glycoside hydrolase families 27, 31, 36 and 66 share a common evolutionary origin with family 13. FEBS Lett., 523, 17–22.
Rigden, D.J. and Franco, O.L. (2002) Beta-helical catalytic domains in glycoside hydrolase families 49, 55 and 87: domain architecture, modelling and assignment of catalytic residues. FEBS Lett., 530, 225–232.
Rychlewski, L., Jaroszewski, L., Li, W., and Godzik, A. (2000) Comparison of sequence profiles: strategies for structural predictions using sequence information. Protein Sci., 9, 232–241.
Shibayama, K., Ohsuka, S., Tanaka, T., Arakawa, Y., and Ohta, M. (1998) Conserved structural regions involved in the catalytic mechanism of Escherichia coli K-12 WaaO(Rfal). J. Bacteriol., 180, 5313–5318.
Tarbouriech, N., Charnock, S.J., and Davies, G.J. (2001) Three-dimensional structures of the Mn and Mg dTDP complexes of the family GT-2 glycosyltransferase SpsA: a comparison with related NDP-sugar glycosyltransferases. J. Mol. Biol., 314, 655–661.
Theologis, A., Ecker, J.R., Palm, C.J., Federspiel, N.A., Kaul, S., White, O., Alonso, J., Altafi, H., Araujo, R., Bowman, C.L., and others. (2000) Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana. Nature, 408, 816–820.
Unligil, U.M. and Rini, J.M. (2000) Glycosyltransferase structure and mechanism. Curr. Opin. Struct. Biol., 10, 510–517.
Unligil, U.M., Zhou, S., Yuwaraj, S., Sarkar, M., Schachter, H., and Rini, J.M. (2000) X-ray crystal structure of rabbit N-acetylglucosaminyltransferase I: enzyme mechanism and a new protein superfamily. EMBO J., 19, 5269–5280.
Verbert, A. and Cacan, R. (1999) "Glyco-deglyco" processes during the biosynthesis of glycoproteins. J. Soc. Biol., 193, 101–110.
Wiggins, C.A. and Munro, S. (1998) Activity of the yeast MNN1 α-1,3-mannosyltransferase requires a motif conserved in many other families of glycosyltransferases. Proc. Natl Acad. Sci. USA, 95, 7945–7950.
Wrabl, J.O. and Grishin, N.V. (2001) Homology between O-linked GlcNAc transferases and proteins of the glycogen phosphorylase superfamily. J. Mol. Biol., 314, 365–374.