Evolution of α-Amylases: Architectural Features and Key Residues in the Stabilization of the (β/α)8 Scaffold

Gerard Pujadas and Jaume Palau

Unitat de Biotecnologia Computacional, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Catalonia, Spain


    Abstract
 
We provide a comprehensive analysis of the current enzymes with α-amylase activity (AAMYs) that belong to glycoside hydrolase family 13 (GH-13; 144 Archaea, Bacteria, and Eukaryota sequences from 87 different species). This study aims to further knowledge of the evolutionary molecular relationships among the sequences of their A and B domains, with special emphasis on the correlation between what is observed in the structures and protein evolution. Multialignments for the A domain distinguish two clusters for sequences from Archaea, eight for sequences from Bacteria, and three for sequences from Eukaryota. The clusters for Bacteria do not follow any strict taxonomic pathway; in fact, they are rather scattered. When we compared the A domains of sequences belonging to different kingdoms, we found that various pairs of clusters were significantly similar. Using either sequence similarity with crystallized structures or secondary-structure prediction methods, we identified in all AAMYs the eight putative β-strands that constitute the β-sheet in the TIM barrel of the A domain and studied the packing in its interior. We also discovered a "hidden homology" in the TIM barrel: an invariant Gly located upstream of the conserved Asp in β-strand 3. This Gly precedes an α-helix and is actively involved in capping its N-terminal end with a capping box. In all cases, a Schellman motif caps the C-terminal end of this helix.


    Introduction
 
α-Amylase (AAMY) (EC 3.2.1.1; 1,4-α-D-glucan glucanohydrolase) is an enzyme present in microorganisms and in tissues from animals and plants. It catalyzes the hydrolysis of α-1,4-glucosidic bonds of glycogen, starch, related polysaccharides (such as the mainly linear amylose and the branched amylopectin), and some oligosaccharides using a retaining mechanism (the resulting hydroxyl group retains the α-configuration). The enzyme liberates α-maltose, α-glucose, and α-limit dextrins stepwise. AAMYs are extremely interesting from the biotechnological point of view, and thermostable enzymes from Bacillus species are particularly important in the production of corn syrups and dextrose (Lee et al. 1991). AAMYs are distributed between families 13 (Archaea, Bacteria, and Eukaryota) and 57 (Archaea and Bacteria) of glycoside hydrolases (Henrissat 1991; Henrissat and Bairoch 1996). Despite their common enzyme activity, the sequences of these two families are not very similar (Janecek 1998). Henceforth, we will deal only with AAMYs from family 13, which is made up primarily of amylolytic enzymes and glycosyltransferases. The sequences studied include "true" AAMYs (with 3.2.1.1 as their single activity) and other sequences for which 3.2.1.1 is one, but not the only, activity (e.g., amylopullulanases). Evolutionary studies have shown that although all AAMYs possess the same catalytic function, their amino acid sequences are quite diverse (Nakajima, Imanaka, and Aiba 1986; Janecek 1992, 1994; Janecek et al. 1999). So far, a number of AAMYs from different species have been crystallized and analyzed by X-ray diffraction. The structure is formed by three domains: (1) a TIM barrel (domain A); (2) a long loop region inserted between βA3 and αA3 (the third β-strand and α-helix in the A domain), known as domain B; and (3) the C domain at the end of the sequence. This study will focus on domains A and B (see fig. 1).



 
Fig. 1. Schematic drawing of the structure of the A and B domains of the Bacillus subtilis AAMY. α-Helices are shown as spiral ribbons, whereas β-strands are drawn as arrows from the amino end to the carboxy end of the β-strand. The TIM barrel fold corresponding to the A domain is shown in light gray, and the B domain is shown in dark gray. The spacefill model of the maltopentaose ligand indicates the position of the active site (at the C-terminal side of the β-sheet in the core of the TIM barrel fold). A, An end view in which the C-terminal side of the β-sheet is toward the reader. B, A side view in which the C-terminal side of the β-sheet is toward the top of the page. The figure was produced with MOLSCRIPT, version 2.1.2 (Kraulis 1991), and the PDB file 1BAG (Fujimoto et al. 1998).

 
In this article, we use current knowledge of the three-dimensional (3-D) structural features to present a computational study that provides insight into the evolutionary relationships among the different AAMYs (144 sequences, either from different biological species or from isozymes of a given species). We identify the eight short segments that belong to the eight putative β-strands that constitute the β-sheet in the TIM barrel domain and conduct a topological and evolutionary study of the layered residues in the interior of this β-sheet. Finally, we analyze the N- and C-terminal capping in the α-helix that precedes βA3.


    Materials and Methods
 
Protein sequences deposited as of April 26, 1999, were imported from the Swiss-All database (formed by SwissProt, TrEMBL, and updates; Bairoch and Apweiler 1999) using the European Bioinformatics Institute Sequence Retrieval System (http://srs.ebi.ac.uk/). We considered only well-identified AAMY sequences (neither putative nor probable nor hypothetical) that belonged to glycoside hydrolase family 13 (GH-13) and fully contained the characteristic A and B domains. All sequences were shortened to produce informationally similar segments that corresponded to the strict (β/α)8 barrel plus B domain (A+B segments) structure (see fig. 1). Therefore, neither the N-terminal tail nor domain C was considered. The reference amino acid segments used to define the A+B sequences were obtained by analyzing the AAMY Protein Data Bank (PDB) structures (Berman et al. 2000). More specifically, we took the PDB sequences from the first residue in βA1 to the last residue in βA8 as our reference. The starting and finishing points for A+B segments in noncrystallized AAMYs were obtained by local alignments against the most similar PDB-derived segments. Identical A+B sequences obtained from the same species were treated as redundant and removed. The remaining sequences constituted our "full sample," which is made up of 144 sequences (7 from Archaea, 48 from Bacteria, and 89 from Eukaryota) from 87 different species. All taxonomic classifications of AAMY sources were done according to the Taxonomy database (http://www3.ncbi.nlm.nih.gov/Taxonomy/).

In a second stage, all the A+B segments were used to produce two different sets of sequences, one corresponding to the strict domain A and the other corresponding to domain B. Once again, the boundary between domains was obtained by visually inspecting all crystallized AAMYs. The sequence for the B domain goes from the highly conserved His residue located after βA3 to four residues downstream of the Asp that binds to Ca2+ (just before αA3, the α-helix preceding the highly conserved βA4). Domain A was obtained by joining the two subsegments on the left and right of domain B. For some sequences (e.g., Q05884 [StrLi]), the C-terminal boundary of domain B was difficult to find. In these special cases, we used the PSIpred server (http://insulin.brunel.ac.uk/psipred) to predict the secondary structure for the corresponding A+B sequences and identify where αA3 began. This server was chosen because at the latest Critical Assessment of Techniques for Protein Structure Prediction (CASP3; http://PredictionCenter.llnl.gov/casp3/), it was the most accurate method of secondary-structure prediction tested, achieving an overall three-state accuracy of 77% across 24 prediction targets. The PSIpred server was used (1) to find the locations of β-strands in domain A of sequences with an overall low similarity with crystallized AAMYs and (2) to confirm the locations of secondary structures inferred from alignments with sequences from crystallized structures. Two prediction methods offered by the server were used extensively to test the coherence of the secondary-structure assignments: PSIpred for predicting protein secondary structure and GenTHREADER for predicting protein tertiary structure by fold recognition.
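The domain split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_domains`, its parameters, and the toy sequence are hypothetical, the positions of the conserved His and the Ca2+-binding Asp are assumed to be known already (e.g., from an alignment against a crystallized AAMY), and domain B is assumed to include both the His and the fourth residue after the Asp.

```python
def split_domains(ab_segment: str, his_index: int, asp_index: int, tail: int = 4):
    """Split an A+B segment into (domain_a, domain_b).

    his_index: 0-based position of the conserved His after betaA3 (assumed start of B).
    asp_index: 0-based position of the Ca2+-binding Asp; B is assumed to end
               `tail` residues downstream of it (inclusive).
    """
    if ab_segment[his_index] != "H":
        raise ValueError("expected His at his_index")
    if ab_segment[asp_index] != "D":
        raise ValueError("expected Asp at asp_index")
    b_end = asp_index + tail + 1  # exclusive end of domain B
    if his_index >= asp_index or b_end > len(ab_segment):
        raise ValueError("inconsistent domain B boundaries")
    domain_b = ab_segment[his_index:b_end]
    # Domain A is the join of the subsegments on either side of domain B.
    domain_a = ab_segment[:his_index] + ab_segment[b_end:]
    return domain_a, domain_b


# Toy sequence (not a real AAMY): A = domain A, B/C = domain B residues.
toy = "AAAAHBBBDCCCCAAAA"
domain_a, domain_b = split_domains(toy, his_index=4, asp_index=8)
print(domain_a, domain_b)
```

For sequences in which the C-terminal boundary is ambiguous, the `asp_index` input would have to come from a secondary-structure prediction, as the text describes.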

To avoid bias in our results due to highly similar isozymes of a given species in the full sample, we defined an AAMY "representative sample." This sample contained 7 sequences from Archaea, 44 from Bacteria, and 61 from Eukaryota (112 in total) and was made up of AAMYs from different biological species and of isozymes with similarity indices (SI) below 95% for both the A and the B domains. Unless otherwise indicated, the results presented in this paper are from the representative sample. The set of AAMY sequences that form this sample is shown in figure 2, in which sequences may be identified by their Swiss-All accession numbers and the abbreviations for the names of the species are made up of the first three letters of the genus name followed by the first two letters of the species name (e.g., AerHy for Aeromonas hydrophila). Results from the full sample may be obtained as supplementary material from our web site (http://argo.urv.es/~pujadas/AAMY/AAMY_01).
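The species abbreviations used throughout (AerHy, BacLi, StrLi, and so on) follow a simple rule that can be written down directly. The helper below is a minimal sketch; the function name `species_code` is hypothetical and is not part of any tool used in this study.

```python
def species_code(binomial: str) -> str:
    """Build the five-letter code: first three letters of the genus
    followed by the first two letters of the species name."""
    genus, species = binomial.split()[:2]
    return genus[:3].capitalize() + species[:2].capitalize()


for name in ("Aeromonas hydrophila", "Bacillus licheniformis", "Hordeum vulgare"):
    print(name, "->", species_code(name))
```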



 
Fig. 2. Clustering tree for AAMY domain A sequences. A total of 1,000 bootstrap replicates were performed (bootstrap values [%] are shown in bold). Branch lengths proportional to the divergence of the amino acid sequences are also shown. The sum of the lengths of the branches linking any AAMYs is a measure of the evolutionary distance between them. Labels for the different clusters and subclusters are shown in the right-hand margin. A, B, and C, The cluster distribution for Archaea, Bacteria, and Eukaryota, respectively. The asterisks in C indicate the outliers for the subclusters.

 
GH-13 sequences from completed bacterial genomes (http://www.ebi.ac.uk/genomes) were obtained from the CAZy database (http://afmb.cnrs-mrs.fr/~pedro/CAZY/ghf_13.html). Analysis of a possible horizontal gene transfer (HGT) mechanism for the evolution of GH-13 in completed bacterial genomes was provided by S. Garcia-Vallvé (personal communication) and obtained by the method of Garcia-Vallvé, Palau, and Romeu (1999).

Crystallographic data, as well as sequence- and structure-derived information, were taken from the database and links in the Structure Explorer (http://www.rcsb.org/pdb/). The PDB entries for AAMY were as follows: 1BSI (Rydberg et al. 1999), 1B2Y (Qian et al. 1994), 1CPU (G. D. Brayer et al., personal communication), 1HNY (Brayer, Luo, and Withers 1995), and 1SMD (Ramasubbu et al. 1996) from Homo sapiens; 1BVN (Wiegand, Epp, and Huber 1995), 1DHK (Bompard-Gilles et al. 1996), 1JFH (Qian et al. 1997), 1OSE (Gilles et al. 1996), 1PIF (Machius et al. 1996), 1PIG (Machius et al. 1996), and 1PPI (Qian et al. 1994) from Sus scrofa; 1JAE (Strobl et al. 1998a), 1TMQ (Strobl et al. 1998b), and 1VIW (Nahoum et al. 1999) from Tenebrio molitor; 1AMY (Kadziola et al. 1994), 1AVA (Vallee et al. 1998), and 1BG9 (Kadziola et al. 1998) from Hordeum vulgare; 2AAA (Boel et al. 1990) from Aspergillus niger; 2TAA (Matsuura et al. 1984) and 6TAA and 7TAA (Brzozowski and Davies 1997) from Aspergillus oryzae; 1BLI (Machius et al. 1998), 1BPL (Machius, Wiegand, and Huber 1995), and 1VJS (Hwang et al. 1997) from Bacillus licheniformis; 1BAG (Fujimoto et al. 1998) from Bacillus subtilis; and 1AQH, 1AQM (Aghajari et al. 1998a) and 1B0I (Aghajari et al. 1998b) from Pseudoalteromonas haloplanctis. X-ray diffraction resolutions and R factors for these structures ranged from 1.6 to 3.2 Å and from 0.151 to 0.208, respectively. Although 1BVZ from Thermoactinomyces vulgaris (Kamitori et al. 1999) is considered an AAMY in its PDB file, a FASTA search (http://www2.ebi.ac.uk/fasta3/) showed that this enzyme is a neopullulanase (EC 3.2.1.135) whose sequence matches the NEPU_THEVU (Q08751) SwissProt entry exactly.

Multiple-sequence alignments were carried out for protein sequences of the working database with the CLUSTAL V algorithm (Higgins and Sharp 1989) and the commercial program MEGALIGN, version 3.16, from the Lasergene software package (1997; DNASTAR, Inc., London, England) running on a Power Macintosh. Initial dendrograms and SIs were calculated by applying available MEGALIGN subroutines that calculated the SI parameter between two sequences (i and j) based on the method of Wilbur and Lipman (1983) with a gap penalty of 3, a K-tuple of 1, five top diagonals, and a window size of 5. The SI was calculated as the number of exactly matching residues in this alignment minus a "gap penalty" for every gap introduced. The result was then expressed as a percentage of the length of the shorter sequence. Multiple-alignment parameters (fixed and floating gap penalties) both had a value of 10. The protein weight matrix was PAM 250. We calculated the phylogenetic trees by the neighbor-joining method (Saitou and Nei 1987) with 1,000 bootstrap replicates and a seed value of 111 with the CLUSTAL X program, version 1.8 (Thompson et al. 1997). Unrooted trees were drawn with NJPLOT (Perrière and Gouy 1996).
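The SI definition given above (matches minus a penalty per gap, as a percentage of the shorter sequence) can be illustrated with a short sketch. This is not the MEGALIGN implementation and does not perform the Wilbur and Lipman alignment itself: it scores a pairwise alignment that already exists. The function name `similarity_index`, the use of "-" as the gap character, and the choice to count each contiguous run of gap characters as one gap are assumptions made for illustration.

```python
import re


def similarity_index(aligned_a: str, aligned_b: str, gap_penalty: float = 3.0) -> float:
    """Score an existing pairwise alignment as an SI percentage.

    SI = 100 * (exact matches - gap_penalty * number of gaps) / length of shorter sequence
    Assumption: a "gap" is one contiguous run of '-' in either aligned sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    gaps = len(re.findall(r"-+", aligned_a)) + len(re.findall(r"-+", aligned_b))
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * (matches - gap_penalty * gaps) / shorter


# Toy alignment: 8 matches, 1 gap, shorter sequence of 8 residues.
print(similarity_index("ACDEFGHIKL", "ACDE--HIKL"))  # 100 * (8 - 3) / 8 = 62.5
```

Because the penalty is subtracted before normalizing, heavily gapped alignments of short sequences can in principle give very low values under this formula.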

Hydrogen bonds involved in helix-capping interactions at the N- and C-terminal ends of αA2, along with distances between donors and acceptors, were analyzed using HBPLUS (McDonald and Thornton 1994). The capping interactions were visually analyzed with the program Rasmol (Sayle and Milner-White 1995) on a Silicon Graphics Indigo2 XZ workstation. The DSSP algorithm included in Rasmol was used to determine the limits of β-strands that were not annotated in the PDB files (e.g., some of the β-strands in the TIM barrel of 1AQH), although their presence was obvious in the visualization.


    Results and Discussion
 
AAMY Sequences Are Grouped into Fairly Distant Clusters
Domain A sequences for Archaea, Bacteria, and Eukaryota were multialigned separately; these multialignments may be obtained from our web site or from the EMBL Sequence Alignment database (ftp://ftp.ebi.ac.uk/pub/databases/embl/align/). The codes in this database are DS42223 for Archaea, DS42225 for Bacteria, and DS42224 for Eukaryota. Figure 2 shows the clustering trees obtained from these multialignments. Despite the inherent difficulty of aligning sequences with very low SIs, the reliability of these alignments rests on (1) the use of structurally equivalent sequence segments, (2) the identification of the putative β-strands that constitute the β-sheet in the TIM barrel, and (3) the identification of the four conserved regions described by Nakajima, Imanaka, and Aiba (1986). Armed with this information, we first multialigned the functionally and structurally equivalent sequence regions, which made it easier to align coils and helices afterward. Figure 2 shows that in AAMYs, clustering analysis distinguished two, eight, and three clusters for Archaea, Bacteria, and Eukaryota, respectively. We followed the criterion that a cluster forms when the SI values of two or more domain A sequences from the same kingdom (Archaea, Bacteria, or Eukaryota) are over 25% for all pairwise comparisons among cluster members. This is the similarity threshold normally used to define representative protein data sets (Hobohm et al. 1992). Using this cluster definition, table 1 shows the ranges of SI and length for A+B, A, and B domain sequences when the multialignments were carried out independently, cluster by cluster. Results in table 1 indicate that although the same 3-D arrangement is expected for all domain A sequences (the TIM barrel fold), its length is highly variable (between 233 and 317 residues). For the B domain, which shows different types of fold (Janecek, Svensson, and Henrissat 1997), the length ranges from 34 to 104 residues (note the short length of domain B in cluster AI). The results in table 1 also reinforce the view that the evolution of domain B matches that of domain A (Janecek, Svensson, and Henrissat 1997); i.e., the ranges of SI values found for A and B domain sequences from the same cluster (built using a domain A similarity criterion) are generally equivalent.
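The cluster criterion stated above (every pairwise SI among members must exceed 25%) amounts to a check over all pairs. The sketch below only tests whether a proposed group satisfies the criterion; it does not build the clusters, which in this study were read from the trees. The function name `is_cluster`, the dictionary layout, and the SI values in the example are hypothetical.

```python
from itertools import combinations


def is_cluster(members, si, threshold=25.0):
    """Return True if all pairwise SI values among `members` exceed `threshold`.

    si: dict mapping frozenset({id_a, id_b}) to the domain A SI (in percent).
    A cluster needs at least two sequences.
    """
    if len(members) < 2:
        return False
    return all(si[frozenset(pair)] > threshold for pair in combinations(members, 2))


# Hypothetical SI values for three made-up sequence identifiers.
si = {
    frozenset(("seq1", "seq2")): 40.0,
    frozenset(("seq1", "seq3")): 30.0,
    frozenset(("seq2", "seq3")): 20.0,
}
print(is_cluster(["seq1", "seq2"], si))          # seq1 and seq2 alone satisfy the criterion
print(is_cluster(["seq1", "seq2", "seq3"], si))  # the seq2/seq3 pair falls below 25%
```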


 
Table 1 Similarity Index (SI) and Sequence Lengths for A + B, A, and B Domain Sequences if Multialignments are Carried Out Independently, Cluster by Cluster

 
Eight Alignable Segments, Each of Three Residues, Are Present in All AAMY Sequences
The packing of side chains within the closed β-sheet has been described as one of the most important factors for maintaining the TIM barrel structure (Lesk, Brändén, and Chothia 1989). Schematically, this packing may be described as a scaffold formed by a set of layers that are perpendicular to the barrel axis (Pujadas et al. 1996). Figure 3 describes the scaffold of BacLi AAMY (PDB code 1BPL; Machius, Wiegand, and Huber 1995). All crystallized AAMYs show a four-layered scaffold in the interior of their TIM barrel (see fig. 3 and visit the web page for the corresponding Rasmol scripts). Figure 3 shows that each layer is made up of four residues (indicated by four black points at the same level) that belong to alternate strands (one residue for each strand) and whose side chains face toward the interior of the barrel (layers 2 and 3 are responsible for the side chain packing in the interior of the β-sheet). The carbon atoms in the α positions of the four residues that form a layer are roughly located in the same plane. Each β-strand contributes to two different layers with two alternate residues, whereas the side chain of the middle residue (in white in fig. 3) faces toward the external helices or coils.



 
Fig. 3. Schematic description of the four-layered scaffold in the interior of the TIM barrel of BacLi AAMY (PDB code 1BPL; Machius, Wiegand, and Huber 1995) produced by cutting along the hydrogen bonds between βA1 and βA8 and further unrolling the β-sheet on a flat surface. Only segments that are three residues long are shown for each β-strand. Each layer is made up of four residues (indicated by four black points at the same horizontal level) that belong to alternate strands (one residue for each strand) and face toward the interior of the barrel. Each β-strand contributes to two different layers with two alternate residues, whereas the middle residue (in white) faces toward the external helices or coils. The layers are perpendicular to the barrel axis.

 
By comparing sequences with crystallized AAMYs or by using PSIpred routines, we extended our study to find the residues that putatively contribute to the four-layer scaffold in AAMYs of unknown 3-D structure. Table 2 shows, for each AAMY, a set of eight sequence segments that are three residues long. Each segment is part of one of the eight β-strands that contribute to the scaffold (e.g., the first segment is part of βA1, the second is part of βA2, and so on). Residues in the same column on different lines are putatively located in equivalent positions along the AAMY structure. The combined information provided by figure 3 and table 2 for BacLi (cluster BV) may be used, for example, to infer the putative four-layered scaffold in all AAMYs. It follows that, for every sequence in table 2, (1) the first and third residues of each of the eight segments face the interior of the barrel and are members of two different layers, and (2) the second residue of each segment faces toward the external helices or coils.
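The rule inferred above (first and third residues face inward, the middle residue faces outward) can be applied mechanically to the eight three-residue segments of any sequence. The sketch below does only that; it deliberately does not assign residues to specific layers, because that mapping is given by figure 3 rather than by the text. The function name `scaffold_residues` and the example segments are hypothetical and are not taken from table 2.

```python
def scaffold_residues(segments):
    """Split eight three-residue beta-strand segments into inward- and outward-facing residues.

    Returns (inward, outward): 16 residues that putatively point into the barrel
    (first and third of each segment) and 8 that point toward helices or coils.
    """
    if len(segments) != 8 or any(len(seg) != 3 for seg in segments):
        raise ValueError("expected eight segments of three residues each")
    inward = [res for seg in segments for res in (seg[0], seg[2])]
    outward = [seg[1] for seg in segments]
    return inward, outward


# Made-up segments, one per strand betaA1..betaA8 (for illustration only).
example = ["AIW", "VQT", "LDA", "FRV", "GEV", "LTW", "VFV", "FNH"]
inward, outward = scaffold_residues(example)
print("".join(inward))
print("".join(outward))
```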


 
Table 2 Sequence Similarities of Short Strips of AAMY Depicting the Segments of the Eight Putative Parallel β-Strands Involved in Internal Packing of the β-Sheet in the A Domain

 
From table 2, we can evaluate the residue occurrence for each of the 16 positions involved in the scaffold (four positions for each layer). The residue occurrence for the full sample and for each AAMY cluster may be obtained from the web site. The main conclusion is that, except for a few highly conserved residues that are present in nearly all putative AAMY TIM barrels and are actively involved in catalysis or ion binding (i.e., Asp100, Arg229 [143 out of 144], and Glu261; see fig. 3 for sequence numbering), the sets of 16 structurally organized residues generally show cluster specificity (e.g., the βA1 contribution to layer 4 is Gln for all plants [cluster EI] and His for all animals [EIII]).
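Residue occurrence at each scaffold position is a per-column tally over the aligned scaffold residues. A minimal sketch follows, assuming each sequence has already been reduced to a string with one character per scaffold position; the function name `residue_occurrence` and the three short example strings are hypothetical (real input would have 16 positions per sequence).

```python
from collections import Counter


def residue_occurrence(scaffolds):
    """Count residues column by column.

    scaffolds: equal-length strings, one per sequence, one character per position.
    Returns a list with one Counter per position.
    """
    if not scaffolds or len({len(s) for s in scaffolds}) != 1:
        raise ValueError("expected a non-empty list of equal-length strings")
    return [Counter(column) for column in zip(*scaffolds)]


# Toy input with three positions instead of sixteen.
counts = residue_occurrence(["QDV", "QDI", "HDV"])
for position, counter in enumerate(counts, start=1):
    print(position, counter.most_common())
```

A position where one residue accounts for nearly all sequences would correspond to the conserved catalytic or ion-binding residues; positions whose dominant residue changes from cluster to cluster would reflect the cluster specificity described above.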

Taxonomic Distribution of AAMYs in the Clusters
Figure 2A shows that AAMYs from Archaea are distributed in two clusters: AI, which contains all of the Crenarchaeota sequences, and AII, which contains all of the Euryarchaeota sequences. The cluster distribution of the Archaea sequences is therefore closely related to the taxonomic grouping. At this point, we would also like to mention that considering sequences from cluster AI "true" AAMYs is controversial (S. Janecek, personal communication). Although it appears that they are able to attack starch (Kato et al. 1996), their investigated substrate/product specificity and mode of action seem different from those found in ordinary AAMYs. Moreover, GenBank (Benson et al. 2000) provides two sequences under two different accession numbers, one designated as AAMY and equivalent to SwissProt entry Q53641 (D64131) and the second designated as maltooligosyltrehalose trehalohydrolase TreZ gene (D83245), whose complete alignment shows that they are 100% identical. Despite these contradictory facts, we included the three Sulfolobus sequences in our study because it seems clear that to some extent they show 3.2.1.1 activity (Kato et al. 1996).

The tree for Bacteria displays eight clusters and one sequence that appears to be solitary from an evolutionary point of view (P14898 [DicTh]; see fig. 2B). In contrast to the Archaea results, there is no general correspondence between cluster distribution and taxonomy for Bacteria (see table 3). In this sense, the distribution of AAMY sequences within clusters appears to be rather scattered. For instance, AAMYs from Firmicutes Actinobacteria Actinobacteridae may be found in clusters BI (2 out of 2) and BVIII (9 out of 13); Firmicutes Actinobacteria Thermoactinomyces are distributed between clusters BII (1 out of 5) and BIV (1 out of 2); Firmicutes from the bacillus/clostridium group may be found in clusters BII (4 out of 5), BIV (1 out of 2), BV (7 out of 9), BVI (7 out of 7), and BVIII (1 out of 13) and outside any cluster (1). Moreover, Proteobacteria AAMYs from the gamma subdivision are scattered between clusters BIII (1 out of 2), BV (2 out of 9), BVII (3 out of 3), and BVIII (3 out of 13). The only Thermotogales AAMY (P96107) falls inside cluster BIII. As we can see in table 3, if a more extended taxonomic classification is used, the scattering between clusters remains (e.g., Bacillaceae/Bacillus AAMYs are scattered among clusters BV and BVI). There is some scattered clustering in the unrooted evolutionary tree for Bacteria described by Janecek (1994) (i.e., the clustering of Proteobacteria and Firmicutes [EscCo and SalTy with BacSt, BacAm, and BacLi; PseHa with StrHy, StrTl, StrVl, StrLi, StrGr, and TheCu]), but this phenomenon is described more accurately in this study. Moreover, cluster BV simultaneously contains AAMYs from Firmicutes and Proteobacteria species (this is also true for cluster BVIII; see table 3). Figure 2B also shows that poorly related AAMY sequences frequently coexist in the same species: (1) P22630 (cluster BVII) and P41131 (cluster BVIII) from AerHy; (2) Q52413 (cluster BVIII) and Q52414 (cluster BVII) from PseSp; (3) Q56791 (cluster BVII) and Q60102 (cluster BIII) from XanCa; (4) Q05884 (cluster BI) and P97179 (cluster BVIII) from StrLi; (5) Q60051 (cluster BIV) and Q60053 (cluster BII) from TheVu; and (6) O50583/Q53786 (cluster BV) and O50582 (cluster BVI) from StrBo.


 
Table 3 Scattering of Bacteria AAMY Sequences in Clusters (classified under taxonomic criteria)

 
In figure 2C, we can see three well-defined clusters which correspond to the evolutionary trees of Eukaryota covering plants (Viridiplantae; EI), fungi (EII), and animals (Metazoa; EIII). A more detailed analysis of the results in figure 2C shows—for all three clusters—an excellent agreement between phylogenetic tree distribution and taxonomy. This is reflected by a set of compact and taxonomically self-consistent subclusters (EI1 for Eudicotyledons, EI2 for Liliopsida, EII1 for Archiascomycetes, EII2 for Hemiascomycetes, EII3 for Euascomycetes, EIII1 for Diptera, EIII2 for Coleoptera/Lepidoptera, EIII3 for Mollusca, and EIII4 for Chordata). The only outliers found for these subclusters are P53354 (AedAe), Q41442 (SolTu), and Q92394 (CryCo). The first two are quite separate from the other sequences from species with a similar taxonomy (Diptera [EIII1] and Eudicotyledons [EI1] respectively). The third outlier—a fungus Basidiomycota—is inserted between the Ascomycota subclusters.

Interkingdom Relationships Between Clusters
Long et al. (1987), Janecek (1994), and Janecek et al. (1999) previously described the linkage between some of the AAMYs that belong to the cluster pairs BVIII/EIII, AII/BV, and AII/EI. Linkage of the BVIII/EIII pair is confirmed by our results with domain A sequences (SI ranges from 27% to 38%), although both clusters now contain significantly more members (see fig. 4A). With respect to the AII/BV relationship described by Janecek et al. (1999), we found that the SI values of some members of clusters BV and AII were also around 30% when their domain A sequences were compared (e.g., the SI value for O50200 [TheSp] and P06278 [BacLi] was 32.7%; see fig. 4B). Moreover, one of the residues considered by Janecek et al. (1999) to be highly characteristic of Archaea AAMYs from cluster AII (i.e., Trp in position i+2 from the βA5 catalytic Glu) was also found in all BV members. This confirms Janecek et al.'s (1999) results for EscCo (P26612) and BacLi (P06278). These authors have suggested that this Trp plays the same role in the active site of AAMYs from cluster AII as the equivalent Trp in the HorVu AAMY (it forms a stacking interaction with one of the acarbose rings bound in the active site; Kadziola et al. 1998). Therefore, by drawing a parallel with Janecek et al.'s conclusions, we may expect the same role for the equivalent Trp in BV AAMYs. On the other hand, we did not detect a set of homogeneous SI values when comparing domain A sequences from clusters AII and EI (SI ranges from 17.1% to 27.2%). Nevertheless, it is clear that this relationship exists at a more localized level (see fig. 1 in Janecek et al. 1999). By comparing all clusters, we also detected a strong similarity between domain A sequences from clusters BIV and EII (see fig. 4C). The range for the SI values obtained from all of the comparisons between members of BIV and EII is 27%–38%. The relationship between BIV and EII involves one newly described cluster (BIV) and therefore cannot be inferred from Janecek et al.'s (1999) results. Consequently, AII, BV, and EI would appear to share the same common ancestor (and the same is also true for BVIII and EIII and for BIV and EII). Moreover, the common ancestor for AII/BV/EI, the one for BVIII/EIII, and the one for BIV/EII are different. This would suggest that the divergence between these three ancestors was a very early event in AAMY evolution. Interkingdom multialignments for the relationships between clusters that are discussed in this section may be obtained from our web site or from the EMBL Sequence Alignment database. The codes in this database are DS43802 for AII/BV/EI, DS43803 for BIV/EII, and DS43804 for BVIII/EIII multialignments.



 

 
Among these cluster pairs, only the BVIII/EIII pair had similar domain B sequences (e.g., the SI value for Q16924 [AedAt] and P41131 [AerHy] was 49.2%). It is also worth mentioning that although domain A sequences from clusters BV and AII were comparable (see above), the B domain for BV members was approximately twice as long as for AII (see table 1). These two factors support our choice of structurally equivalent sequence segments (the A domain) as the basis for AAMY clustering analysis.

We have therefore made a classification of known AAMYs which is based on an objective definition of a cluster (see above). This definition describes the most important known AAMY relationships (Long et al. 1987; Janecek 1994; Janecek et al. 1999) and is useful for finding, in a systematic way, (1) new clusters (AI, BII, and BIV), (2) undescribed relationships between clusters (BIV/EII), and (3) scattering of bacteria between different clusters irrespective of the taxonomy.

Conserved Sequence Segments in the A Domain and Structural Analysis of the N- and C-Terminal Capping of Helix αA2
Table 4 shows four motifs, written in PROSITE syntax (Hofmann et al. 1999), corresponding to the four most conserved sequence segments in the A domain of AAMYs. The motifs were found by visual inspection of the regions of the three multialignments (Archaea, Bacteria, and Eukaryota) where the sequences were highly similar. The multialignments were constructed in this case with the full sample (144 sequences), and the numbers show the residue occurrence in the representative sample (112 sequences). Note that the four conserved segments bundle at the C-terminal end of the barrel scaffold (where the core of the enzyme activity is located; see fig. 1).
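Motifs written in PROSITE syntax can be scanned against sequences after translating them into regular expressions. The sketch below covers only the common elements of the syntax (`x` wildcards, `[...]` allowed sets, `{...}` forbidden sets, `(n)` or `(n,m)` repeats, and `<` / `>` anchors outside brackets). The function name `prosite_to_regex` is hypothetical, and the pattern used in the example is invented for illustration; it is not one of the motifs in table 4.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("<", "^").replace(">", "$")  # terminal anchors
        element = element.replace("x", ".")                    # any residue
        element = element.replace("{", "[^").replace("}", "]")  # forbidden residues
        element = element.replace("(", "{").replace(")", "}")   # repeat counts
        parts.append(element)
    return "".join(parts)


# Invented pattern: Gly, any two residues, one of Leu/Ile/Val, anything but Pro, Asp.
regex = prosite_to_regex("G-x(2)-[LIV]-{P}-D.")
print(regex)
match = re.search(regex, "AAGKKLADAA")
print(match.group() if match else None)
```

The forbidden-set braces are converted before the repeat parentheses so that the braces produced for repeat counts are not rewritten by mistake.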


 
Table 4 The Four Most Conserved Sequence Segments in the A Domain of AAMYs

 
In the L2H2L'2ß3L3 motif, we included the strictly conserved Gly residue located 22 residues upstream (26 in the case of Q60053 [TheVu]) of the highly conserved Asp residue (ßA3) that is important for the substrate binding site. Structural analysis of the role of this Gly in crystallized AAMYs showed that, in all cases, it is actively involved in capping the N-terminal end of helix {alpha}A2. This capping is of the "capping box" type (Harper and Rose 1993), in which Gly occupies the N' position (see fig. 5A). The main chain dihedral angles of GlyN' are well preserved in all AAMYs and lie in a region of the Ramachandran map that is favored only for Gly residues ({phi} = 144.1 ± 5.7°; {psi} = -172.2 ± 10.0°). A similar degree of main chain dihedral angle preservation is observed for the residues in the Ncap position ({phi} = -101.5 ± 7.7°; {psi} = 176.4 ± 8.9°). At this point, we may wonder why residues such as Met, Leu, or Ile, which are the most common at N' in capping boxes (Aurora and Rose 1998), have not been selected throughout evolution by at least some AAMYs to cap the N-terminal end of helix {alpha}A2. These residues would allow the hydrophobic interaction between the N' and N4 positions that is typical of capping boxes. In AAMYs, the Gly residue at N' cannot establish this hydrophobic interaction with the residue at N4 (Leu, Phe, or Tyr), and no alternative hydrophobic partners near N4 have been found. Aurora and Rose (1998) reported a statistical analysis of all capping boxes found in a nonredundant PDB subset of 274 polypeptide chains. In their study, Gly was not the "best" general option for the N' position, and the preferred {phi} and {psi} values for N' ({phi} = -102 ± 12°; {psi} = 140 ± 25°) and for Ncap ({phi} = -86 ± 12°; {psi} = 150 ± 25°) differ from those found in AAMYs.
From all of these data, we may speculate that the presence of Gly (the residue with the greatest main chain conformational freedom) at the N' position of AAMYs is important for shifting the {phi} and {psi} of Ncap away from the common values, which allows a proper orientation of the Ncap residue and the subsequent formation of the capping box. This would explain why Gly at N' has been strictly preserved throughout AAMY evolution.
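The size of the deviation between the AAMY dihedral angles and the general capping-box values can be made explicit with a short calculation. The following Python sketch is only an illustration: it takes the mean values quoted above, computes the smallest difference between two angles on the circular scale, and reports whether each AAMY mean falls outside the spread quoted from Aurora and Rose (1998). The function and variable names are hypothetical, and treating the ± value as a simple tolerance is an assumption made for the illustration, not a statistical test.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    diff = abs(a - b) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


# Mean dihedral angles (degrees) quoted in the text for crystallized AAMYs.
AAMY = {
    ("N'", "phi"): 144.1,
    ("N'", "psi"): -172.2,
    ("Ncap", "phi"): -101.5,
    ("Ncap", "psi"): 176.4,
}

# Preferred values (mean, spread) quoted in the text from Aurora and Rose (1998).
GENERAL = {
    ("N'", "phi"): (-102.0, 12.0),
    ("N'", "psi"): (140.0, 25.0),
    ("Ncap", "phi"): (-86.0, 12.0),
    ("Ncap", "psi"): (150.0, 25.0),
}


def compare(aamy=AAMY, general=GENERAL):
    """Return {(position, angle): (difference in degrees, outside_spread)}."""
    result = {}
    for key, value in aamy.items():
        mean, spread = general[key]
        diff = angular_difference(value, mean)
        # Assumption: "outside" simply means farther from the mean than the +/- value.
        result[key] = (round(diff, 1), diff > spread)
    return result


if __name__ == "__main__":
    for (position, angle), (diff, outside) in compare().items():
        print(f"{position} {angle}: differs by {diff} deg, outside spread: {outside}")
```

Because {psi} values near ±180° wrap around, a plain subtraction would overstate the difference for {psi} of N'; the circular difference avoids this.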



 
  Fig. 5.—Examples of the capping at the N- and C-terminal ends of helix {alpha}A2 in crystallized AAMYs. The PDB codes for the different structures, the AAMY clusters in which they are included, the residues involved in the capping, and the hydrogen bond distances are also shown. The arrows indicate the proton donor–proton acceptor direction in the hydrogen bonds involved in the capping. The first (between residues in the Ncap and N4 positions) and last (between residues in the C4 and Ccap positions) backbone hydrogen bonds of the {alpha}-helix are indicated by solid lines. The nomenclature for helices and their flanking residues is shown at the top of the figure. A, The capping at the N-termini is of the "capping box" type, with a Gly always in the N' position. This capping is characterized by a reciprocal hydrogen bond pattern: the side chain of the residue in the Ncap position accepts the proton from the backbone amide of N3, and the side chain of N3 accepts the amide proton of the residue in Ncap. The side chain of the residue in Ncap usually acts as a "continuation" of the helix backbone, so the Ncap side chain–N3 main chain interaction has an almost {alpha}-helical hydrogen bond geometry. Typically, the capping box also shows a hydrophobic interaction between residues in N' and N4 (Aurora and Rose 1998). The hydrogen bonds between residues in the Ncap and N3 positions are indicated by arrows (Ncap main chain–N3 side chain and Ncap side chain–N3 main chain as dashed and solid arrows, respectively). B, The capping at the C-termini is of the "Schellman motif" type. This motif is characterized by a residue in the C' position that can adopt a left-handed {alpha}-helical conformation (i.e., a residue with a positive value for {phi}; typically Gly, although non-ß-branched residues are also allowed) and by a hydrogen bond pattern involving the following main chain atoms: (1) the carbonyl oxygen of C3, which accepts the proton from the amide of the residue in C'', and (2) the carbonyl oxygen of C2, which accepts the proton from the amide of the residue in C'. Typically, the Schellman motif also shows a hydrophobic interaction between residues in C3 and C'' (Aurora and Rose 1998). The hydrogen bonds that define the motif are indicated by arrows (C3–C'' and C2–C' as dashed and solid arrows, respectively).

 
The Schellman (1980) motif is invariably used by all crystallized AAMYs to cap the C-terminal end of helix {alpha}A2 (see fig. 5B). Comparison of the {alpha}A2 Schellman motif found in BacLi (1VJS; cluster BV) with those of the other AAMY structures shows that when Gly at C' is replaced by Asp, the hydrogen bond pattern, the main chain conformation of the C' residue, and the hydrophobic interaction that define the motif remain unchanged (see fig. 5B). For example, in AAMYs, the {phi} and {psi} values in the C' position are 79.2 ± 7.3° and 18.4 ± 9.4°, respectively, for Gly and 66.6 ± 2.2° and 30.3 ± 9.4°, respectively, for Asp (both pairs of mean values lie in the left-handed {alpha}-helical conformation).
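Whether a residue is in a left-handed {alpha}-helical conformation can be judged from its dihedral angles alone. The Python sketch below checks the mean C' values quoted above against a rectangular region of the Ramachandran map. The region limits (0° < {phi} < 120°; -30° < {psi} < 90°) are a rough assumption chosen for illustration; they are not defined in this study, and the function name is hypothetical.

```python
# Assumed, approximate limits (degrees) of the left-handed alpha-helical region.
# These bounds are illustrative only; they are not taken from the text.
PHI_RANGE = (0.0, 120.0)
PSI_RANGE = (-30.0, 90.0)


def is_left_handed_helical(phi, psi):
    """True if (phi, psi) falls inside the assumed left-handed helical region."""
    return PHI_RANGE[0] < phi < PHI_RANGE[1] and PSI_RANGE[0] < psi < PSI_RANGE[1]


# Mean (phi, psi) values quoted in the text for the C' position of helix alphaA2.
C_PRIME_MEANS = {"Gly": (79.2, 18.4), "Asp": (66.6, 30.3)}

if __name__ == "__main__":
    for residue, (phi, psi) in C_PRIME_MEANS.items():
        print(residue, is_left_handed_helical(phi, psi))
```

Under these assumed limits, both mean pairs fall inside the region, whereas the GlyN' values of the capping box ({phi} = 144.1°; {psi} = -172.2°) do not.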

All noncrystallized AAMYs have sequences that are compatible with a capping box and a Schellman motif at the N- and C-terminal ends of {alpha}A2, respectively. At the sequence level, the {alpha}A2 Schellman motif found in AAMYs is more heterogeneous than the capping box at the start of the same helix (e.g., the C' position is occupied by eight different residues in our AAMY sample). The length of {alpha}A2 defined by the two capping motifs is 15 residues for all AAMYs except Q60053 (TheVu). The structural preservation of this helix throughout evolution is interesting, and new experimental approaches are required to determine more precisely why this helix is so important in AAMYs.

Using various (ß/{alpha})8 barrel enzymes, Janecek (1996) defined a "hidden homology" as a region that is more or less conserved throughout evolution in the equivalent part of the structure of the other enzymes that share this fold. Following this definition, we studied whether the same type of capping box and Schellman motif are also found in the other crystallized TIM barrels from GH-13 (Pujadas and Palau 1999; http://argo.urv.es/~pujadas/TIM). The structures analyzed have the following enzyme activities: 2.4.1.19 (cyclomaltodextrin glucanotransferase), 3.2.1.10 (oligo-1,6-glucosidase), 3.2.1.60 (glucan 1,4-{alpha}-maltotetrahydrolase), 3.2.1.68 (isoamylase), and 3.2.1.135 (neopullulanase). Our results show that, in all cases, the structure of both capping motifs, the length of helix {alpha}A2, and the number of residues between the invariant Gly and Asp are the same as those found in AAMYs. The numberings for GlyN' in the capping box, for the C' residue in the Schellman motif, and for the highly conserved Asp (ßA3) in PDB sequences representative of the above-mentioned enzyme activities are Gly114-Asn130-Asp136 (1CIU; Knegtel et al. 1996), Gly76-Asn92-Asp98 (1UOK; Watanabe et al. 1997), Gly90-Gly106-Asp112 (2AMG; Morishita et al. 1997), Gly270-Gly286-Asp292 (1BF2; Katsuya et al. 1998), and Gly217-Gly233-Asp239 (1BVZ; Kamitori et al. 1999). We may therefore conclude that the first part of the L2H2L'2ß3L3 motif, from Gly to Asp, can be considered a strictly conserved structural motif in GH-13.
Whether this motif is a "hidden homology" sensu stricto must be corroborated by finding other TIM barrel proteins which have no apparent sequence relationship with those from GH-13 but have (1) an {alpha}-helix 15 residues long that is limited at its N-terminal end by a capping box with an invariant Gly in N' position and at its C-terminal end by a Schellman motif and (2) an invariant Asp located at a constant distance from GlyN'.
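The constant spacing in the residue numberings quoted above can be verified with simple arithmetic. The Python sketch below is a minimal illustration under two assumptions that are not stated in the text: that residue numbering is contiguous between GlyN' and Asp in each PDB entry (no insertion codes or gaps), and that the helix length is counted from Ncap (the residue after N') to Ccap (the residue before C'), both included. The function name is hypothetical.

```python
# (PDB code, Gly N', C' residue, Asp in strand betaA3), numbering as quoted in the text.
MOTIFS = [
    ("1CIU", 114, 130, 136),
    ("1UOK", 76, 92, 98),
    ("2AMG", 90, 106, 112),
    ("1BF2", 270, 286, 292),
    ("1BVZ", 217, 233, 239),
]


def motif_spacing(n_prime, c_prime, asp):
    """Return (helix length, Gly-to-Asp offset) for one structure.

    Assumes contiguous numbering, with Ncap = N' + 1 and Ccap = C' - 1.
    """
    helix_length = (c_prime - 1) - (n_prime + 1) + 1  # Ncap..Ccap inclusive
    gly_to_asp = asp - n_prime
    return helix_length, gly_to_asp


if __name__ == "__main__":
    for pdb, n_prime, c_prime, asp in MOTIFS:
        print(pdb, motif_spacing(n_prime, c_prime, asp))
```

Under these assumptions, all five structures give a helix of 15 residues and an Asp located 22 positions after GlyN', in agreement with the values reported for AAMYs.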

RasMol scripts that focus on the above-mentioned capping arrangements for AAMYs and other crystallized TIM barrels from GH-13 can be found on our website.


    Speculations and Conclusions
 
The old idea that the protein fold is better preserved than the sequence throughout evolution (Schulz and Schirmer 1979) is fully accepted nowadays. The A domain of AAMY sequences is an excellent example of the preservation of a fold, the TIM barrel, with a simultaneous general loss of sequence similarity. In fact, the gene sequences have diverged from the "first AAMY" gene throughout evolution in such a way that only the four segments indicated in table 4 are easily identified when AAMYs that belong to unrelated clusters are compared. As we have also reported, the structural features of the helix that joins the conserved residues Gly (LA2) and Asp (ßA3) are also extraordinarily well preserved.

Our results suggest a general hypothesis for the evolution of domain A in AAMYs that involves two waves of evolutionary events. In the first wave, a limited set of genes strongly diverged from their common ancestral gene, the first AAMY gene. These "first-generation" AAMYs probably had very low SI between them, had different lengths, and were the precursors of some of the unrelated clusters shown in figure 2. Therefore, the only characteristics that the first-generation AAMYs preserved from the original gene were (1) the four segments indicated in table 4, (2) the 3-D structure, and (3) the characteristics of helix {alpha}A2. Some examples of these first-generation AAMYs are the common ancestors of AII/BV/EI, of BVIII/EIII, and of BIV/EII. Each of these first-generation AAMYs would evolve further in a second wave to produce the sequences that form a cluster (each first-generation AAMY gives rise to a different cluster). These "second-generation" sequences diverge only slightly when compared with those of the first wave and make up the contemporary AAMYs. During this second wave, HGT could have generated the interkingdom similarities between clusters (AII/BV/EI, BIV/EII, and BVIII/EIII). In this sense, Mazodier and Davies (1991) suggested that the sequence similarity between the AAMY from StrLm (P09794; cluster BVIII) and those from mammals and invertebrates (cluster EIII) found by Long et al. (1987) may be proof of natural gene transfer between distantly related organisms, and that the direction of the HGT would be from Eukaryota to Streptomyces.

The clustering tree for Bacteria shows two characteristics that are not found in the trees for Archaea and Eukaryota: (1) AAMY sequences from the same species or from closely related species (on taxonomic grounds) are scattered among different clusters and, conversely, AAMYs from very distantly related organisms are grouped together, and (2) poorly related AAMY sequences frequently coexist in the genomes of some species and are included in different clusters. Following the above-mentioned general hypothesis for the evolution of domain A in AAMYs, we suggest that the evolutionary events that gave rise to this special distribution of bacterial AAMYs occurred during the first wave. This hypothesis is based on the fact that the sequences affected by these evolutionary phenomena are fully coherent with the characteristics of the rest of the sequences in their cluster, which indicates that they share the same common ancestor.

We may wonder which evolutionary events are responsible for the fact that the classification of the bacterial AAMY genes is not coherent with the current classification of the species. The first AAMY gene probably gave rise to the limited set of first-generation AAMYs in two different ways: (1) gene duplication followed by independent parsimonious evolution, and (2) HGT. In addition to AAMY, the GH-13 superfamily comprises another 18 enzyme activities. There is functional (Kuriki and Imanaka 1999) and sequence-based evidence (del-Rio, Morett, and Soberon 1997; Garcia-Vallvé, Palau, and Romeu 1999) that the "first" GH-13 in a genome can give rise to a set of paralogs through massive gene duplication. These sequences can subsequently evolve by independent parsimonious evolution and by the acquisition of subtly different specificities to give the rest of the GH-13 genes in the genome. Nevertheless, new glycoside hydrolase genes may also be acquired by a genome in a radically different way: by HGT from an exogenous organism (Mazodier and Davies 1991; Garcia-Vallvé, Palau, and Romeu 1999; Garcia-Vallvé, Romeu, and Palau 2000). Figure 2B shows that poorly related AAMY sequences coexist in the genomes of some bacterial species. Most probably, the same is true for other bacteria in which, for the moment, only one gene has been characterized. Therefore, completed bacterial genomes could help us to discover whether a genome contains more than one AAMY gene and to test the evolutionary relationships among such genes, i.e., to determine whether they are paralogs or whether one of them has arrived by HGT. Since function assignment for genome-derived sequences is usually obtained by sequence comparison with proteins of known function and not from biochemical analysis, such a study should not be restricted to AAMYs and should therefore consider all GH-13 genes.
Only 11 out of 25 completed bacterial genomes (Aquifex aeolicus, Bacillus subtilis, Chlamydia muridarum, Chlamydia pneumoniae, Chlamydia trachomatis, Deinococcus radiodurans, Escherichia coli, Haemophilus influenzae, Mycobacterium tuberculosis, Synechocystis sp., and Thermotoga maritima) have at least one GH-13 (93 genes). According to Garcia-Vallvé, Palau, and Romeu (1999), the GH-13 sequences in E. coli and B. subtilis seem to be the product of the duplication of a common ancestral gene and did not arrive in their genomes by HGT (the same is valid for the GH-13 genes of the rest of the completed bacterial genomes; S. Garcia-Vallvé, personal communication). The bacterial genomes in which poorly related AAMY sequences coexist (i.e., AerHy, PseSp, XanCa, StrLi, TheVu, and StreBo) have not yet been completed, and therefore only partial information is available. Once they are completed, it would be of interest to study whether these genes are paralogs or the result of HGT. Such information would allow us to fill in the gaps in the story of AAMY evolution.


    Supplementary Material
 
Supplementary material may be obtained from our website (http://argo.urv.es/~pujadas/AAMY/AAMY_01).


 
Table 2 Continued

 


 

 


 
Fig. 4.—The most significant inter-kingdom similarities between domain A clusters. A total of 1,000 bootstrap replicates were performed (bootstrap values [%] are shown in bold). Branch lengths proportional to the divergence of the amino acid sequences are also shown. The sum of the lengths of the branches linking any AAMYs is a measure of the evolutionary distance between them. Labels for the different clusters that are similar are shown in the right-hand margin of the figure

 

    Acknowledgements
 
This work has not been awarded grants by any research-supporting institution. We would like to thank Prof. Enric Querol for computational facilities, Dr. Stefan Janecek for his critical reading of the manuscript before submission, Drs. Santiago Garcia-Vallvé and Guy Perrière for valuable information on bacterial genome evolution, and Kevin Costello (from the Language Service of our University) for his help during the writing of this manuscript.


    Footnotes
 
Manolo Gouy, Reviewing Editor

1 Abbreviations: 3-D, three-dimensional; AAMY, {alpha}-amylase; GH-13, glycoside hydrolase from family 13; HGT, horizontal gene transfer; PDB, Protein Data Bank; SI, similarity index.

2 Keywords: {alpha}-amylase, TIM barrel, protein evolution, helix capping, structural phylogeny

3 Address for correspondence and reprints: Gerard Pujadas, Unitat de Biotecnologia Computacional, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, Tarragona 43005, Catalonia, Spain. E-mail: pujadas{at}quimica.urv.es


    literature cited
 

    Aghajari, N., G. Feller, C. Gerday, and R. Haser. 1998a. Crystal structures of the psychrophilic {alpha}-amylase from Alteromonas haloplanctis in its native form and complexed with an inhibitor. Protein Sci. 7:564–572.

    ———. 1998b. Structures of the psychrophilic Alteromonas haloplanctis {alpha}-amylase give insights into cold adaptation at a molecular level. Structure 6:1503–1516.

    Aurora, R., and G. D. Rose. 1998. Helix capping. Protein Sci. 7:21–38.

    Bairoch, A., and R. Apweiler. 1999. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999. Nucleic Acids Res. 27:49–54.

    Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, B. A. Rapp, and D. L. Wheeler. 2000. GenBank. Nucleic Acids Res. 28:15–18.

    Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. 2000. The Protein Data Bank. Nucleic Acids Res. 28:235–242.

    Boel, E., L. Brady, A. M. Brzozowski, Z. Derewenda, G. G. Dodson, V. J. Jensen, S. B. Petersen, H. Swift, L. Thim, and H. F. Woldike. 1990. Calcium binding in {alpha}-amylases: an X-ray diffraction study at 2.1 Å resolution of two enzymes from Aspergillus. Biochemistry 29:6244–6249.

    Bompard-Gilles, C., P. Rousseau, P. Rouge, and F. Payan. 1996. Substrate mimicry in the active center of a mammalian {alpha}-amylase: structural analysis of an enzyme-inhibitor complex. Structure 4:1441–1452.

    Brayer, G. D., Y. G. Luo, and S. G. Withers. 1995. The structure of human pancreatic {alpha}-amylase at 1.8 Å resolution and comparisons with related enzymes. Protein Sci. 4:1730–1742.

    Brzozowski, A. M., and G. J. Davies. 1997. Structure of the Aspergillus oryzae {alpha}-amylase complexed with the inhibitor acarbose at 2.0 Å resolution. Biochemistry 36:10837–10845.

    Del-Rio, G., E. Morett, and X. Soberon. 1997. Did cyclodextrin glycosyltransferases evolve from {alpha}-amylases? FEBS Lett. 416:221–224.

    Fujimoto, Z., K. Takase, N. Doui, M. Momma, T. Matsumoto, and H. Mizuno. 1998. Crystal structure of a catalytic-site mutant {alpha}-amylase from Bacillus subtilis complexed with maltopentaose. J. Mol. Biol. 277:393–407.

    Garcia-Vallvé, S., J. Palau, and A. Romeu. 1999. Horizontal gene transfer in glycosyl hydrolases inferred from codon usage in Escherichia coli and Bacillus subtilis. Mol. Biol. Evol. 16:1125–1134.

    Garcia-Vallvé, S., A. Romeu, and J. Palau. 2000. Horizontal gene transfer of glycosyl hydrolases of the rumen fungi. Mol. Biol. Evol. 17:352–361.

    Gilles, C., J. P. Astier, G. Marchis-Mouren, C. Cambillau, and F. Payan. 1996. Crystal structure of pig pancreatic {alpha}-amylase isoenzyme II, in complex with the carbohydrate inhibitor acarbose. Eur. J. Biochem. 238:561–569.

    Harper, E. T., and G. D. Rose. 1993. Helix stop signals in proteins and peptides: the capping box. Biochemistry 32:7605–7609.

    Henrissat, B. 1991. A classification of glycosyl hydrolases based on amino-acid sequence similarities. Biochem. J. 280:309–316.

    Henrissat, B., and A. Bairoch. 1996. Updating the sequence-based classification of glycosyl hydrolases. Biochem. J. 316:695–696.

    Higgins, D. G., and P. M. Sharp. 1989. Fast and sensitive multiple sequence alignments on a microcomputer. CABIOS 5:151–153.

    Hobohm, U., M. Scharf, R. Schneider, and C. Sander. 1992. Selection of representative protein data sets. Protein Sci. 1:409–417.

    Hofmann, K., P. Bucher, L. Falquet, and A. Bairoch. 1999. The PROSITE database, its status in 1999. Nucleic Acids Res. 27:215–219.

    Hwang, K. Y., H. K. Song, C. Chang, J. Lee, S. Y. Lee, K. K. Kim, S. Choe, R. M. Sweet, and S. W. Suh. 1997. Crystal structure of thermostable {alpha}-amylase from Bacillus licheniformis refined at 1.7 Å resolution. Mol. Cells 7:251–258.

    Janecek, S. 1992. New conserved amino acid region of {alpha}-amylases in the third loop of their (ß/{alpha})8 barrel domains. Biochem. J. 288:1069–1070.

    ———. 1994. Sequence similarities and evolutionary relationships of microbial, plant and animal {alpha}-amylases. Eur. J. Biochem. 224:519–524.

    ———. 1996. Invariant glycines and prolines flanking in loops the strand ß2 of various ({alpha}/ß)8-barrel enzymes: a hidden homology? Protein Sci. 5:1136–1143.

    ———. 1998. Sequence of archaeal Methanococcus jannaschii {alpha}-amylase contains features of families 13 and 57 of glycosyl hydrolases: a trace of their common ancestor? Folia Microbiol. 43:123–128.

    Janecek, S., E. Leveque, A. Belarbi, and B. Haye. 1999. Close evolutionary relatedness of {alpha}-amylases from Archaea and plants. J. Mol. Evol. 48:421–426.

    Janecek, S., B. Svensson, and B. Henrissat. 1997. Domain evolution in the {alpha}-amylase family. J. Mol. Evol. 45:322–331.

    Kadziola, A., J. Abe, B. Svensson, and R. Haser. 1994. Crystal and molecular structure of barley {alpha}-amylase. J. Mol. Biol. 239:104–121.

    Kadziola, A., M. Sogaard, B. Svensson, and R. Haser. 1998. Molecular structure of a barley {alpha}-amylase-inhibitor complex: implications for starch binding and catalysis. J. Mol. Biol. 278:205–217.

    Kamitori, S., S. Kondo, K. Okayama, T. Yokota, Y. Shimura, T. Tonozuka, and Y. Sakano. 1999. Crystal structure of Thermoactinomyces vulgaris R-47 {alpha}-amylase II (TVAII) hydrolyzing cyclodextrins and pullulan at 2.6 Å resolution. J. Mol. Biol. 287:907–921.

    Kato, M., Y. Miura, M. Kettoku, T. Komeda, A. Iwamatsu, and K. Kobayashi. 1996. Reaction mechanism of a new glycosyltrehalose-hydrolyzing enzyme isolated from the hyperthermophilic archaeum, Sulfolobus solfataricus KM1. Biosci. Biotech. Biochem. 60:925–928.

    Knegtel, R. M., R. D. Wind, H. J. Rozeboom, K. H. Kalk, R. M. Buitelaar, L. Dijkhuizen, and B. W. Dijkstra. 1996. Crystal structure at 2.3 Å resolution and revised nucleotide sequence of the thermostable cyclodextrin glycosyltransferase from Thermoanaerobacterium thermosulfurigenes EM1. J. Mol. Biol. 256:611–622.

    Katsuya, Y., Y. Mezaki, M. Kubota, and Y. Matsuura. 1998. Three-dimensional structure of Pseudomonas isoamylase at 2.2 Å resolution. J. Mol. Biol. 281:885–897.

    Kraulis, P. J. 1991. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24:946–950.

    Kuriki, T., and T. Imanaka. 1999. The concept of the {alpha}-amylase family: structural similarity and common catalytic mechanism. J. Mol. Biol. 157:105–132.

    Lee, S. Y., S. Kim, R. M. Sweet, and S. W. Suh. 1991. Crystallization and a preliminary X-ray crystallographic study of {alpha}-amylase from Bacillus licheniformis. Arch. Biochem. Biophys. 291:255–257.

    Lesk, A. M., C. I. Brändén, and C. Chothia. 1989. Structural principles of {alpha}/ß barrel proteins: the packing of the interior of the sheet. Proteins 5:139–148.

    Long, C. M., M. J. Virolle, S. Y. Chang, S. Chang, and M. J. Bibb. 1987. {alpha}-Amylase gene of Streptomyces limosus: nucleotide sequence, expression motifs, and amino acid sequence homology to mammalian and invertebrate {alpha}-amylase. J. Bacteriol. 169:5745–5754.

    McDonald, I. K., and J. M. Thornton. 1994. Satisfying hydrogen bonding potential in proteins. J. Mol. Biol. 238:777–793.

    Machius, M., N. Declerck, R. Huber, and G. Wiegand. 1998. Activation of Bacillus licheniformis {alpha}-amylase through a disorder->order transition of the substrate-binding site mediated by a calcium-sodium-calcium metal triad. Structure 6:281–292.

    Machius, M., L. Vertesy, R. Huber, and G. Wiegand. 1996. Carbohydrate and protein-based inhibitors of porcine pancreatic {alpha}-amylase: structure analysis and comparison of their binding characteristics. J. Mol. Biol. 260:409–421.

    Machius, M., G. Wiegand, and R. Huber. 1995. Crystal structure of calcium-depleted Bacillus licheniformis {alpha}-amylase at 2.2 Å resolution. J. Mol. Biol. 246:545–559.

    Matsuura, Y., M. Kusunoki, W. Harada, and M. Kakudo. 1984. Structure and possible catalytic residues of Taka-amylase A. J. Biochem. 95:697–702.

    Mazodier, P., and J. Davies. 1991. Gene transfer between distantly related bacteria. Annu. Rev. Genet. 25:147–171.

    Morishita, Y., K. Hasegawa, Y. Matsuura, Y. Katsube, M. Kubota, and S. Sakai. 1997. Crystal structure of a maltotetraose forming exo-amylase from Pseudomonas stutzeri. J. Mol. Biol. 267:661–672.

    Nahoum, V., F. Farisei, V. Le-Berre-Anton, M. P. Egloff, P. Rouge, E. Poerio, and F. Payan. 1999. A plant-seed inhibitor of two classes of {alpha}-amylases: X-ray analysis of Tenebrio molitor larvae {alpha}-amylase in complex with the bean Phaseolus vulgaris inhibitor. Acta Crystallogr. D 55:360–362.

    Nakajima, R., T. Imanaka, and S. Aiba. 1986. Comparison of amino acid sequences of eleven different {alpha}-amylases. Appl. Microbiol. Biotechnol. 23:355–360.

    Perrière, G., and M. Gouy. 1996. WWW-Query: an on-line retrieval system for biological sequence banks. Biochimie 78:364–369.

    Pujadas, G., and J. Palau. 1999. TIM barrel fold: structural, functional and evolutionary characteristics in natural and designed molecules. Biologia Bratislava 54:231–254.

    Pujadas, G., F. M. Ramírez, R. Valero, and J. Palau. 1996. Evolution of ß-amylase: patterns of variation and conservation in subfamily sequences in relation to parsimony mechanisms. Proteins 25:456–472.

    Qian, M., R. Haser, G. Buisson, E. Duee, and F. Payan. 1994. The active center of a mammalian {alpha}-amylase. Structure of the complex of a pancreatic {alpha}-amylase with a carbohydrate inhibitor refined to 2.2 Å resolution. Biochemistry 33:6284–6294.

    Qian, M., S. Spinelli, H. Driguez, and F. Payan. 1997. Structure of a pancreatic {alpha}-amylase bound to a substrate analogue at 2.03 Å resolution. Protein Sci. 6:2285–2296.

    Ramasubbu, N., V. Paloth, Y. G. Luo, G. D. Brayer, and M. J. Levine. 1996. Structure of human salivary {alpha}-amylase at 1.6 Å resolution: implications for its role in the oral cavity. Acta Crystallogr. D 52:435–446.

    Rydberg, E. H., G. Sidhu, H. C. Vo et al. (11 co-authors). 1999. Cloning, mutagenesis, and structural analysis of human pancreatic {alpha}-amylase expressed in Pichia pastoris. Protein Sci. 8:635–643.

    Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.

    Sayle, R., and E. J. Milner-White. 1995. RasMol: biomolecular graphics for all. Trends Biochem. Sci. 20:333–379.

    Schellman, C. 1980. The alpha-L conformation at the ends of helices. Pp. 53–61 in R. Jaenicke, ed. Protein folding. Elsevier/North Holland, New York.

    Schulz, G. E., and R. H. Schirmer. 1979. Principles of protein structure. Springer-Verlag, New York.

    Strobl, S., K. Maskos, M. Betz, G. Wiegand, R. Huber, F. X. Gomis-Ruth, and R. Glockshuber. 1998a. Crystal structure of yellow meal worm {alpha}-amylase at 1.64 Å resolution. J. Mol. Biol. 278:617–628.

    Strobl, S., K. Maskos, G. Wiegand, R. Huber, F. X. Gomis-Ruth, and R. Glockshuber. 1998b. A novel strategy for inhibition of {alpha}-amylases: yellow meal worm {alpha}-amylase in complex with the Ragi bifunctional inhibitor at 2.5 Å resolution. Structure 6:911–921.

    Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.

    Vallee, F., A. Kadziola, Y. Bourne, M. Juy, K. W. Rodenburg, B. Svensson, and R. Haser. 1998. Barley {alpha}-amylase bound to its endogenous protein inhibitor BASI: crystal structure of the complex at 1.9 Å resolution. Structure 6:649–659.

    Vihinen, M., and P. Mäntsälä. 1989. Microbial amylolytic enzymes. Crit. Rev. Biochem. Mol. Biol. 24:329–418.

    Watanabe, K., Y. Hata, H. Kizaki, Y. Katsube, and Y. Suzuki. 1997. The refined crystal structure of Bacillus cereus oligo-1,6-glucosidase at 2.0 Å resolution: structural characterization of proline-substitution sites for protein thermostabilization. J. Mol. Biol. 269:142–153.

    Wiegand, G., O. Epp, and R. Huber. 1995. The crystal structure of porcine pancreatic {alpha}-amylase in complex with the microbial inhibitor Tendamistat. J. Mol. Biol. 247:99–110.

    Wilbur, W. J., and D. J. Lipman. 1983. Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 80:726–730.

Accepted for publication September 7, 2000.