Fold recognition analysis of glycosyltransferase families: further members of structural superfamilies

Octávio L. Franco1,2,3 and Daniel J. Rigden2

2 Embrapa Genetic Resources and Biotechnology, Cenargen/Embrapa, Brasilia-DF, Brazil; and 3 Universidade Católica de Brasília, Pós-Graduação em Ciências Genômicas e Biotecnologia, SGAN Quadra 916, Módulo B, Av. W5 Norte, 70.790-160 Brasília-DF, Brazil

Received on April 24, 2003; revised on June 27, 2003; accepted on June 27, 2003


    Abstract
 
Glycosyltransferases (GTs) are diverse enzymes organized into 65 families. X-ray crystallography and in silico studies have shown many of these to belong to two structural superfamilies: GT-A and GT-B. Through application of fold recognition and iterated sequence searches, we demonstrate that families 60, 62, and 64 may also be grouped into the GT-A fold superfamily. Analysis of conserved acidic residues suggests that catalytic sites are better conserved in superfamily GT-B than in GT-A. Although 26% and 29% of GT families may now be confidently placed in superfamilies GT-A and GT-B, respectively, the remaining 45% of families bear no discernible resemblance to either superfamily, which, given the sensitivity of modern fold recognition methods, suggests the existence of novel structural scaffolds associated with GT activity. Furthermore, bioinformatics studies indicate the apparent ease with which mechanism—inverting or retaining—may change during evolution.

Key words: evolutionary relationships / fold recognition / glycosyltransferases / MurG / SpsA


    Introduction
 
Glycosyltransferases (GTs; EC 2.4.–.–) constitute a large group of enzymes involved in the biosynthesis of oligosaccharides and polysaccharides. They act by transferring sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. Particularly abundant is a group of enzymes, present in both prokaryotes and eukaryotes, that utilize an activated nucleotide sugar as donor and play significant roles in important biological processes (Verbert and Cacan, 1999).

As of June 2003, the carbohydrate active enzymes (CAZY) classification contains 65 GT families, defined on the basis of sequence similarity (Coutinho and Henrissat, 1999; Coutinho et al., 2003). With the ongoing determination of GT structures (for review, see Unligil and Rini, 2000) and computational analyses (Wrabl and Grishin, 2001; Breton et al., 2002), a picture has emerged of two GT superfamilies, each containing various families that do not necessarily share significant sequence similarity. The most studied family within superfamily GT-A is family 2, which contains the inverting glycosyltransferase SpsA from Bacillus subtilis (Figure 1) (Charnock and Davies, 1999; Tarbouriech et al., 2001). This enzyme acts in spore coat formation, and its homologs include cellulose synthase and numerous proteins involved in bacterial cell surface glycosylation (as reviewed by Unligil and Rini, 2000). The structure of SpsA is a single domain consisting of parallel β-strands flanked on either side by α-helices (Figure 1) (Charnock and Davies, 1999). Within GT-B, family 28 has received most attention, particularly the MurG protein. This enzyme is an N-acetylglucosaminyltransferase involved in the intracellular phase of bacterial peptidoglycan biosynthesis that catalyzes the transfer of N-acetyl-D-glucosamine (GlcNAc) from UDP-GlcNAc to the C4 hydroxyl group of a lipid-linked N-acetylmuramoyl pentapeptide (Ikeda et al., 1990). The structure of MurG contains two domains, each with a Rossmann-fold α/β open-sheet structure, separated by a deep cleft in which the substrates bind (Figure 1) (Ha et al., 2000; Hu et al., 2003).



 
Fig. 1. Ribbon diagrams of SpsA (PDB code 1QG8) from B. subtilis, representing the GT-A superfamily, and MurG from E. coli complexed to UDP-GlcNAc (PDB code 1NLM), representing the GT-B superfamily. Residues important for substrate binding, specificity, ion binding, and catalysis are shown in black. In SpsA the manganese ion is shown as a dark gray sphere, and in MurG the substrate is shown in light gray. The figure was made using the SPDBViewer 3.7 program (Guex and Peitsch, 1997).

 
Here we present the results of a fold recognition investigation of GT families whose catalytic domain architecture is still unknown, through which families 60, 62, and 64 were assigned to the GT-A fold superfamily. The failure to assign folds to catalytic domains in the remaining families, despite the application of a battery of modern fold recognition methods, suggests that, contrary to the general supposition, other folds are likely to be associated with GT activity. Surprisingly, just a single iteration of PSI-BLAST is required to demonstrate homology between inverting and retaining GT families.


    Results and discussion
 
Fold recognition analysis and evolutionary relationships by PSI-BLAST
Representative sequences (see Table I) from GT families for which the catalytic domain structure was not known were submitted for analysis at the Meta-Server (Bujnicki et al., 2001a). Although this provides convenient access to many different methods, we focused on the results of two consensus fold recognition methods, Pcons2 (Lundstrom et al., 2001) and Shotgun on 3 (Fischer, 2003), that distinguish true and false positives more effectively than individual methods. For comparison we also monitored the results of the 3D-PSSM method, one of the best-performing individual methods (Kelley et al., 2000). Results are summarized in Table I. To estimate confidence in the results we compared scores with the worst (most confidently scored) false positives produced by each individual method (http://bioinfo.pl/LiveBench; Bujnicki et al., 2001b). Currently, the worst three false positives of the Pcons2 method score 2.42, 1.87, and 1.13. For Shotgun on 3 the figures are 31.4, 22.6, and 21.5, whereas for 3D-PSSM, where the most confident hits have the lowest values, they are 0.037, 0.28, and 0.30. The fact that the top-scoring structures for the GT families presented in Table I were also GT enzymes, along with the unanimity of the predictions, provided further support for the correctness of the assignment, particularly in view of the current perception that most GT families have one of the two known GT-associated folds.
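The comparison against the worst false positives can be sketched in a few lines of Python. This is only an illustration of the reasoning described above, not the procedure actually scripted for this study: the thresholds are the worst false-positive scores quoted in the text, while the function names and the example scores are hypothetical.

```python
# Worst false-positive score quoted in the text for each method, paired with a
# flag saying whether higher scores mean more confident hits.
# For 3D-PSSM the most confident hits have the LOWEST values.
WORST_FALSE_POSITIVE = {
    "Pcons2": (2.42, True),
    "Shotgun on 3": (31.4, True),
    "3D-PSSM": (0.037, False),
}


def beats_worst_false_positive(method, score):
    """Return True if `score` is more confident than the worst false positive."""
    threshold, higher_is_better = WORST_FALSE_POSITIVE[method]
    if higher_is_better:
        return score > threshold
    return score < threshold


def unanimous(scores):
    """Return True if every method's score beats its worst false positive."""
    return all(beats_worst_false_positive(m, s) for m, s in scores.items())


# Hypothetical scores for one query family (not taken from Table I).
example_scores = {"Pcons2": 3.0, "Shotgun on 3": 40.0, "3D-PSSM": 0.01}
print(unanimous(example_scores))
```

Requiring every method to pass is a deliberately strict choice made for this sketch; the text itself treats unanimity of the predictions as supporting evidence rather than as a formal cut-off.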


 
Table I. Families of glycosyltransferases mentioned in the text and belonging to the GT-A superfamily

 
Fold recognition analyses showed that families 12, 16, 21, 25, 27, 45, 54, 55, 60, 62, and 64 matched the structure of SpsA (PDB: 1QG8), a GT from family 2, which belongs to superfamily GT-A. Relationships between family 2 and families 12, 16, 21, 25, 27, 45, 54, and 55 have been previously noted (Breton et al., 2002; CAZY home page, http://afmb.cnrs-mrs.fr/CAZY). Thus in this study we have added three more families (60, 62, and 64) to superfamily GT-A (Table I). No further families could be allocated to GT-B other than those already assigned (Wrabl and Grishin, 2001).

A striking conclusion is that even after comprehensive fold recognition, no folds could be assigned to 45% of GT families, containing 39% of known GT sequences. There are two alternative explanations. The first is that the CAZY database contains a large number of extremely divergent GT families that in fact possess known GT folds, but whose catalytic domain folds are not identifiable even after the application of advanced fold recognition tools (Fischer and Rychlewski, 2003). The second, which is perhaps more likely given the success of fold recognition when applied to carbohydrate active enzymes (Rigden and Franco, 2002; Rigden, 2002), is the existence of one or more different folds not currently associated with GT activity. Given the importance of GTs in general (Unligil and Rini, 2000), these would be very significant targets for structural determination.

Because fold recognition can produce significant results for structural analogs as well as distant structural homologs, we sought further evidence regarding evolutionary relationships with sensitive sequence comparisons carried out using PSI-BLAST. The results (Figure 2) provide evidence that most of the families recently identified as members of GT-A (Breton et al., 2002) (Table I) share a common evolutionary origin, although in four cases (families 16, 54, 55, and 60) no relationships could be established (Figure 2). None of the illustrated families leaves the network when only results obtained at the more conservative E-value threshold of 0.001 are considered.



 
Fig. 2. Schematic representation of the GT family relationships revealed by PSI-BLAST (Altschul et al., 1997). The presence of family B in the significant results of searches made using family A is represented by an arrow from family A to family B. Each arrow is associated with the number of iterations required to demonstrate the relationship (x/y) at E-value thresholds of 0.01 (x) or 0.001 (y). Squares represent inverting enzymes and ovals represent retaining enzymes. The enzymatic mechanism of family 60, without a frame, is not known. Gray is used for newly demonstrated relationships (Table I) and black for those previously known (Breton et al., 2002).

 
Another important question relates to the evolution of the enzymatic mechanism: with what ease can inverting enzymes evolve into retaining ones, and vice versa? As shown in Figure 2, a single round of PSI-BLAST was sufficient to demonstrate a relationship between family 2 (inverting) and family 45 (retaining). Similarly, within superfamily GT-B, just two iterations of PSI-BLAST at an E-value threshold of 0.001 are necessary to demonstrate a relationship between retaining family 5 and inverting family 19 (Wrabl and Grishin, 2001; data not shown). These data suggest that the transition from inverting to retaining mechanism, or the reverse, occurs relatively easily, so that multiple transitions have likely happened in each superfamily. Such close relationships between retaining and inverting GTs have also been seen in other studies (Campbell et al., 1998; Breton et al., 1998).

Conservation of residues important to enzymatic activity
Current knowledge of GT mechanisms suggests the obligatory involvement of conserved catalytic acidic residues in inverting enzymes. However, as reviewed by Davies and Henrissat (2002), the mechanism of retaining GTs remains obscure. A mechanism analogous to that of retaining glycoside hydrolases would involve a conserved catalytic nucleophile; alternative mechanisms might not, although stabilization of the positively charged intermediates might be effectively carried out by acidic residues. It has also often proved difficult, particularly within superfamily GT-B, to locate these catalytic residues, even with the benefit of structural information. For example, based on the structure of the inverting enzyme MurG complexed to UDP-GlcNAc, several acidic residues were mutated, but the loss of none of these abolished catalytic activity (Hu et al., 2003). Thus no aspartate or glutamate has so far been definitively identified as catalytic in the GT-B fold superfamily.

Using the fold recognition alignments, totally (100%) and strongly (90%) conserved residues in GT families were identified and their positions compared within superfamilies (Figure 3A, 3B). In this way, tendencies in the positioning of conserved acidic residues were sought that might help locate the catalytic acidic residues. However, three possible complications had to be borne in mind: inaccuracies in the fold recognition alignments, the existence of divergent nonexpressed genome sequences, and the presence of noncatalytic proteins within GT families (Unligil and Rini, 2000). It was also necessary to identify conserved acidic residues with known noncatalytic roles.
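The identification of totally and strongly conserved acidic residues can be illustrated with a minimal Python sketch. It is not the MView calculation used in this work. It assumes the alignment is supplied as equal-length strings, counts gaps in the denominator, and uses a hypothetical function name and a toy alignment.

```python
from collections import Counter

ACIDIC = {"D", "E"}  # aspartic acid and glutamic acid


def conserved_acidic_columns(alignment, strong=0.9):
    """Map 0-based column index -> (residue, label) for acidic residues that
    are totally (100%) or strongly (at least `strong`) conserved.

    Assumption of this sketch: gap characters count towards the column total.
    """
    if not alignment:
        return {}
    n_seqs = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    result = {}
    for col in range(length):
        counts = Counter(seq[col].upper() for seq in alignment)
        residue, count = counts.most_common(1)[0]
        if residue not in ACIDIC:
            continue
        fraction = count / n_seqs
        if fraction == 1.0:
            result[col] = (residue, "100%")
        elif fraction >= strong:
            result[col] = (residue, "90%")
    return result


# Toy alignment of ten three-residue sequences (illustrative only).
toy = ["DEA"] * 9 + ["DQG"]
print(conserved_acidic_columns(toy))
```

Note that Asp and Glu are scored separately here, so a column split between the two would not be reported; whether such columns should be pooled is a design choice this sketch leaves open.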




 
Fig. 3. Plot of conserved acidic residues in GT families with a GT-A (A) or GT-B (B) fold. Closed triangle, 100% conservation of an aspartic acid; open triangle, 90% conservation of an aspartic acid; closed circle, 100% conservation of a glutamic acid; open circle, 90% conservation of a glutamic acid; closed squares, 100% conservation of two aspartic acids; open squares, 90% conservation of two aspartic acids. The sequences were aligned using ClustalW (Higgins et al., 1994) and conserved residues were identified using the MView program (Brown et al., 1998). Dotted lines represent the residues related to the M2+ binding site, UDP binding, and the probable catalytic site of GT-A fold families (Unligil and Rini, 2000). Light gray zones represent the G-loops, and the dark gray zone indicates a highly conserved region that could be involved in the catalytic mechanism of GT-B fold families (Ha et al., 2000).

 
Within the GT-A superfamily there is a strong candidate for the catalytic residue: Asp191 (SpsA numbering; Tarbouriech et al., 2001). A similar residue in the same position was previously described in structures of families GT-7 (Gastinel et al., 1999) and GT-13 (Unligil et al., 2000). In all the families studied for which a relationship with family 2 was observed, a suitably located Asp or Glu (allowing for local errors in alignment) is present (Figure 3A). The presence of a catalytic Glu in this same region has a precedent in family 43 (Pedersen et al., 2000). Several site-directed mutagenesis studies support a key role for these acidic residues in the enzymatic mechanism. When these residues were mutated in glycosyltransferases from Sinorhizobium meliloti (Garinot-Schneider et al., 2000) and from Salmonella enterica (Keenleyside et al., 2001), the enzymatic activity was severely decreased. Most interestingly, similarly placed highly conserved acidic residues are seen in the retaining families and in family 60, whose mechanism (retaining or inverting) is unknown. This may be seen as support for a double displacement mechanism of retention of configuration, analogous to that observed in glycoside hydrolases, which would necessarily involve the participation of a catalytic nucleophile, most likely Asp or Glu (Davies and Henrissat, 2002). However, the positively charged intermediates and transition states that would be involved in alternative mechanisms (Davies and Henrissat, 2002) would likely require charge compensation as a means toward their stabilization, and Asp and Glu residues could also fulfill that role.

Another conserved Asp (numbered 39 in SpsA), located at the end of strand β2 (Figures 1 and 3A), is involved in nucleotide binding, interacting with the uracil moiety by making a hydrogen bond to N3 of the uracil base (Charnock and Davies, 1999). Figure 3A demonstrates that only families 12 and 16 contain fully conserved aspartic acids in the vicinity. The remaining inverting families, with the exception of family 54, have a 90% conserved Asp and/or Glu. In family 54, 75% of sequences contain a Glu residue in this same position. The retaining families 27, 45, and 55 all have conserved acidic residues for binding UDP. In contrast, families 60, 62, and 64 do not show any conservation of acidic residues at this position (Figure 3A), although some of these similarly bind UDP nucleotide compounds. Presumably other specificity-conferring mechanisms function in these families.

GT-A also contains a motif known as the DxD motif (Wiggins and Munro, 1998; Shibayama et al., 1998; Tarbouriech et al., 2001) (Figure 3), although the arrangement of the two Asp residues varies. In SpsA, this motif adopts the sequence xD98D99. In four representative structures available for the fold superfamily, the first aspartate residue of the motif binds to hydroxyl groups on the ribose moiety, whereas the second aspartic acid binds the divalent metal ion, which could be Mn2+ (Tarbouriech et al., 2001). The Mn2+ ion is clearly positioned to counter the negative charge that develops on the β-phosphate on cleavage of the donor sugar–phosphate linkage (Cowan, 1998). As shown in Figure 3A, all families except 27 and 62 have an acidic residue that is at least 90% conserved in this region. In family 27 the figure drops to 88%. Again, the families newly assigned to GT-A provide a surprise; no conserved acidic residue is seen in this region for family 62.
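As an illustration of how candidate DxD motifs might be located in a sequence, the following Python sketch scans for the strict pattern Asp, any residue, Asp. It is a simplification rather than the analysis performed here: as noted above, the arrangement of the two Asp residues varies (in SpsA the motif reads xD98D99), so a strict pattern would not necessarily capture every family. The function name and test sequence are hypothetical.

```python
import re

# A lookahead is used so that overlapping occurrences (e.g. within "DDD")
# are all reported rather than only the first non-overlapping match.
DXD_PATTERN = re.compile(r"(?=(D.D))")


def find_dxd(sequence):
    """Return (1-based start position, matched tripeptide) for each strict DxD."""
    return [(m.start() + 1, m.group(1))
            for m in DXD_PATTERN.finditer(sequence.upper())]


# Made-up peptide, used only to exercise the function.
print(find_dxd("MADVDKDDD"))
```

Because the pattern is only three residues long, a match taken alone says little; in the text the motif is interpreted together with the fold recognition alignment and the conservation data of Figure 3A.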

Our understanding of mechanism in GT-B is less advanced. One particular region of sequence conservation between families of the GT-B superfamily (Ha et al., 2000) lies around residues 250–290 (MurG numbering). Three G-loops (glycine-rich loops located at turns between the carboxyl ends of β-strands and the N-termini of the following α-helices in Rossmann fold domains; Baker et al., 1992) have also been the focus of attention (Ha et al., 2000). Later structural determination of the UDP-GlcNAc:MurG complex (Hu et al., 2003) revealed that two G-loops and the 250–290 conserved region are responsible for binding to the UDP-GlcNAc substrate. All inverting families had the acidic residues conserved in this region, with the exception of families 9 and 41. In family 41, 78% of sequences contained a conserved glutamine. Among the residues forming hydrogen bonds to the substrate is Glu269 (Figure 1), for which a role in distinguishing between UDP and TDP has been proposed (Hu et al., 2003). This residue is 90% conserved in family 28 (containing MurG itself) and 100% conserved in family 30 (Figure 3B), whereas an Asp residue may functionally substitute in families 19 and 33. A catalytic role for this residue may be effectively ruled out because its replacement with Ala in MurG led to only a modest loss of activity (Hu et al., 2003). Families 5, 9, and 41 do not have any conservation of acidic residues in this region and must use other mechanisms to confer substrate specificity.

From these analyses it is clear that further work is required to help locate catalytic acidic residues in GT-B and in retaining families of GT-A. The lack of any perceptible trends in positioning of conserved acidic residues in GT-B (Figure 3B), even considering only inverting enzymes, suggests that several catalytic site architectures may well be present in the superfamily.


    Materials and methods
 
Members of GT families 2, 12, 16, 21, 25, 27, 28, 45, 54, 55, 56, 60, 62, and 64 were located in the CAZY database and retrieved using Entrez (www.ncbi.nlm.nih.gov/entrez). Groups of sequences were aligned with ClustalW (Higgins et al., 1994). Manipulation and limited hand-editing of alignments were performed with Jalview (www.ebi.ac.uk/~michele/jalview). Residues with 90% or 100% conservation were located using the MView program (Brown et al., 1998). Fold recognition experiments made use of the Structure Prediction META server (Bujnicki et al., 2001a). Particular attention was paid to the results of the two consensus fold recognition analyses, Pcons2 (Lundstrom et al., 2001) and the Shotgun on 3 consensus prediction (Fischer, 2003), which produces a score based on the results of three independent fold recognition methods: FFAS (Rychlewski et al., 2000), 3D-PSSM (Kelley et al., 2000), and Inbgu (Fischer, 2000).

Iterated sequence database searches were carried out using PSI-BLAST (Altschul et al., 1997) at the NCBI (www.ncbi.nlm.nih.gov/BLAST), using either 0.01 or 0.001 as the E-value cut-off below which a sequence is included in the next iteration. The appearance of a member of a given GT family among the significant results of a search using a member of a different GT family was taken to indicate significant sequence similarity and hence to support a common evolutionary origin for the two families. As input for the fold recognition and iterated database searches we used representative sequences of all families as listed in Table I.
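The criterion used to link families can be expressed as a small directed graph, in the spirit of Figure 2. The Python sketch below is illustrative only. It assumes that the search results have already been reduced to a mapping from each query family to the families found among its significant hits, together with the iteration at which each first appeared; the function names and the placeholder families A to D are hypothetical.

```python
def family_links(search_results):
    """Turn {query_family: {hit_family: first_iteration}} into a sorted list of
    directed edges (query, hit, iteration), ignoring hits to the query's own family."""
    edges = []
    for query, hits in search_results.items():
        for hit, iteration in hits.items():
            if hit != query:
                edges.append((query, hit, iteration))
    return sorted(edges)


def connected_families(edges, start):
    """Return every family linked to `start`, directly or through intermediate
    families, treating the edges as undirected."""
    neighbours = {}
    for query, hit, _ in edges:
        neighbours.setdefault(query, set()).add(hit)
        neighbours.setdefault(hit, set()).add(query)
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Placeholder results: family A finds B at iteration 1, B finds C at iteration 3,
# and D finds no other family.
results = {"A": {"A": 1, "B": 1}, "B": {"C": 3}, "D": {}}
edges = family_links(results)
print(edges)
print(sorted(connected_families(edges, "A")))
```

Treating the edges as undirected when collecting a network is an assumption of this sketch; the arrows in Figure 2 retain the direction of each search.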


    Footnotes
 
1 To whom correspondence should be addressed; e-mail: ocfranco@cenargen.embrapa.br


    Abbreviations
 
CAZY, carbohydrate active enzymes; GT, glycosyltransferases; GT-A, glycosyltransferase from fold superfamily A; GT-B, glycosyltransferase from fold superfamily B; MurG, GT-B from E. coli; SpsA, GT-A from B. subtilis


    References
 
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

Baker, P.J., Britton, K.L., Rice, D.W., Rob, A., and Stillman, T.J. (1992) Structural consequences of sequences patterns in the fingerprint region of the nucleotide binding fold: implications for nucleotide specificity. J. Mol. Biol., 228, 662–671.

Breton, C., Bettler, E., Joziasse, D.H., Geremia, R.A., and Imberty, A. (1998) Sequence-function relationship of prokaryotic and eukaryotic galactosyltransferases. J. Biochem. (Tokyo), 123, 1000–1009.

Breton, C., Heissigerova, H., Jeanneau, C., Moravcova, J., and Imberty, A. (2002) Comparative aspects of glycosyltransferases. Biochem. Soc. Symp., 69, 23–32.

Brown, N.P., Leroy, C., and Sander, C. (1998) MView: a Web compatible database search or multiple alignment viewer. Bioinformatics, 14, 380–381.

Bujnicki, J.M., Elofsson, A., Fischer, D., and Rychlewski, L. (2001a) Structure prediction meta server. Bioinformatics, 17, 7500–7511.

Bujnicki, J.M., Elofsson, A., Fischer, D., and Rychlewski, L. (2001b) LiveBench-1: continuous benchmarking of protein structure prediction servers. Protein Sci., 10, 352–361.

Campbell, J.A., Davies, G.J., Bulone, V., and Henrissat, B. (1998) A classification of nucleotide-diphospho-sugar glycosyltransferases based on amino acid sequence similarities. Biochem. J., 329, 719.

Charnock, S.J. and Davies, G.J. (1999) Structure of the nucleotide-diphospho-sugar transferase, SpsA from Bacillus subtilis, in native and nucleotide-complexed forms. Biochemistry, 38, 6380–6385.

Colonna-Romano, S., Porta, A., Franco, A., Kobayashi, G.S., and Maresca, B. (1998) Identification and isolation by DDRT-PCR of genes differentially expressed by Histoplasma capsulatum during macrophages infection. Microb. Pathog., 25, 55–66.

Coutinho, P.M. and Henrissat, B. (1999) Carbohydrate-active enzymes server. Available at http://afmb.cnrs-mrs.fr/CAZY. Accessed August 20, 2003.

Coutinho, P.M., Deleury, E., Davies, G.J., and Henrissat, B. (2003) An evolving hierarchical family classification for glycosyltransferases. J. Mol. Biol., 328, 307–317.

Cowan, J.A. (1998) Magnesium activation of nuclease enzymes: the importance of water. Inorg. Chim. Acta, 275, 24–27.

Davies, G.J. and Henrissat, B. (2002) Plant glyco-related genomics. Structural enzymology of carbohydrate-active enzymes: implications for the post-genomic era. Biochem. Soc. Trans., 30, 291–297.

Fischer, D. (2000) Hybrid fold recognition: combining sequence derived properties with evolutionary information. In Altman, R.B., Dunker, A.K., Hunter, L., Laudardale, K., and Klein, T.E. (Eds.), Pacific Symposium on Biocomputing. World Scientific, Singapore, pp. 119–130.

Fischer, D. (2003) 3D-SHOTGUN: a novel, cooperative, fold-recognition meta-predictor. Proteins, 51, 434–441.

Fischer, D. and Rychlewski, L. (2003) The 2002 Olympic games of protein structure prediction. Protein Eng., 16, 157–160.

Garinot-Schneider, C., Lellouch, A.C., and Geremia, R.A. (2000) Identification of essential amino-acid residues in the Sinorhizobium meliloti glucosyltransferase ExoM. J. Biol. Chem., 275, 31407–31423.

Gastinel, L.N., Cambilau, C., and Bourne, Y. (1999) Crystal structures of the bovine β4-galactosyltransferase catalytic domain and its complex with uridine diphosphogalactose. EMBO J., 18, 3546–3557.

Guex, N. and Peitsch, M.C. (1997) Swiss-model and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18, 2714–2723.

Ha, S., Walker, D., Shi, Y., and Walker, S. (2000) The 1.9 Å crystal structure of Escherichia coli MurG, a membrane-associated glycosyltransferase involved in peptidoglycan biosynthesis. Protein Sci., 9, 1045–1052.

Higgins, D., Thompson, J., Gibson, T., Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.

Hu, Y., Chen, L., Ha, S., Gross, B., Falcone, B., Walker, D., Mokhtarzadeh, M., and Walker, S. (2003) Crystal structure of the MurG:UDP-GlcNAc complex reveals common structural principles of a superfamily of glycosyltransferases. Proc. Natl Acad. Sci. USA, 100, 845–849.

Ikeda, M., Wachi, M., Jung, H.K., Ishino, F., and Matsuhashi, M. (1990) Nucleotide sequence involving murG and murC in the mra gene cluster region of Escherichia coli. Nucleic Acids Res., 18, 4014–4014.

Keenleyside, W.J., Clarke, A.J., and Whitfield, C. (2001) Identification of residues involved in catalytic activity of the inverting glycosyl transferase WbbE from Salmonella enterica serovar borreze. J. Bacteriol., 183, 77–85.

Kelley, L.A., MacCallum, R.M., and Sternberg, M.J. (2000) Enhanced genome annotation using structural profiles in the program 3D-PSSM. J. Mol. Biol., 299, 499–520.

Lundstrom, J., Rychlewski, L., Bujnicki, J., and Elofsson, A. (2001) Pcons: a neural-network-based consensus predictor that improves fold recognition. Protein Sci., 10, 2354–2362.

Parkhill, J., Wren, B.W., Thomson, N.R., Titball, R.W., Holden, M.T.G., Prentice, M.B., Sebaihia, M., James, K.D., Churcher, C., Mungall, K.L., and others. (2001) Genome sequence of Yersinia pestis, the causative agent of plague. Nature, 413, 523–527.

Pedersen, L.C., Tsuchida, K., Kitagawa, H., Sugahara, K., Darden, T.A., and Negishi, M. (2000) Heparan/chondroitin sulfate biosynthesis: structure and mechanism of human glucuronyltransferase I. J. Biol. Chem., 275, 34580–34585.

Rigden, D.J. (2002) Iterative database searches demonstrate that glycoside hydrolase families 27, 31, 36 and 66 share a common evolutionary origin with family 13. FEBS Lett., 523, 17–22.

Rigden, D.J. and Franco, O.L. (2002) Beta-helical catalytic domains in glycoside hydrolase families 49, 55 and 87: domain architecture, modelling and assignment of catalytic residues. FEBS Lett., 530, 225–32.

Rychlewski, L., Jaroszewski, L., Li, W., and Godzik, A. (2000) Comparison of sequence profiles: strategies for structural predictions using sequence information. Protein Sci., 9, 232–241.

Shibayama, K., Ohsuka, S., Tanaka, T., Arakawa, Y., and Ohta, M. (1998) Conserved structural regions involved in the catalytic mechanism of Escherichia coli K-12 WaaO(Rfal). J. Bacteriol., 180, 5313–5318.

Tarbouriech, N., Charnock, S.J., and Davies, G.J. (2001) Three-dimensional structures of the Mn and Mg dTDP complexes of the family GT-2 glycosyltransferase SpsA: a comparison with related NDP-sugar glycosyltransferases. J. Mol. Biol., 314, 655–661.

Theologis, A., Ecker, J.R., Palm, C.J., Federspiel, N.A., Kaul, S., White, O., Alonso, J., Altafi, H., Araujo, R., Bowman, C.L., and others. (2000) Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana. Nature, 408, 816–820.

Unligil, U.M. and Rini, J.M. (2000) Glycosyltransferase structure and mechanism. Curr. Opin. Struct. Biol., 10, 510–517.

Unligil, U.M., Zhou, S., Yuwaraj, S., Sarkar, M., Schachter, H., and Rini, J.M. (2000) X-ray crystal structure of rabbit N-acetylglucosaminyltransferase I: enzyme mechanism and a new protein superfamily. EMBO J., 19, 5269–5280.

Verbert, A. and Cacan, R. (1999) "Glyco-deglyco" processes during the biosynthesis of glycoproteins. J. Soc. Biol., 193, 101–110.

Wiggins, C.A. and Munro, S. (1998) Activity of the yeast MNN1 α-1,3-mannosyltransferase requires a motif conserved in many other families of glycosyltransferases. Proc. Natl Acad. Sci. USA, 95, 7945–7950.

Wrabl, J.O. and Grishin, N.V. (2001) Homology between O-linked GlcNAc transferases and proteins of the glycogen phosphorylase superfamily. J. Mol. Biol., 314, 365–374.