1 Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 52900, Israel
E-mail: ron{at}biocom1.ls.biu.ac.il
Abstract |
---|
Keywords: circular permutations/database searches/domain fusion/edit distance
Introduction |
---|
Since proteins subjected to artificial circular permutation usually maintain their function, it is possible that circular permutation may serve as a constructive mechanism in evolution. Indeed, there are some cases in which such permutations were suggested to have occurred naturally. Several pairs of natural proteins that appear to be related by circular permutation have been described. Examples include lectins (Cunningham et al., 1979; Hemperly and Cunningham, 1983), saposin (Ponting and Russell, 1995) and β-glucanase (Heinemann and Hahn, 1995b). A comprehensive review was published by Lindqvist and Schneider (Lindqvist and Schneider, 1997).
There are two possible mechanisms that can give rise to pairs of proteins related by circular permutation. In the first, a `parent' gene may undergo a direct local genetic manipulation of its DNA sequence to form a circularly permuted mutant. It has been suggested that a possible mechanism for this might be a duplication of a gene followed by deletion of both termini leaving a permutated gene in the middle. The other alternative is that proteins that are related by circular permutation were formed by fusion of two smaller components to form a larger unit. This process could occasionally occur independently at least twice in evolution, each time in a different order.
The major distinction between laboratory experiments of circular permutations and naturally occurring examples is that in laboratory experiments the two circularly permuted forms of the protein are identical, other than the permutation. In nature, it is unlikely that a pair of proteins would be exact circular permutations of each other. A more reasonable evolutionary scenario would be one in which a gene is mutated into another protein by a permutation or fusion event; thereafter both proteins would further diverge by the standard evolutionary events of insertions, deletions and substitutions.
Thus, in the context of natural sequences we define two sequences S1 and S2 to be related by a circular permutation if S1 = xy and S2 = y'x', where the fragment x is similar to x' and the fragment y is similar to y' under some sequence similarity measure. Note that an effort to search for circular permutations in a systematic manner requires a precise definition of what is considered to be a circular permutation and what is not. We suggest a stringent definition for circular permutation (see below), which enforces the requirement that the N-terminus of one protein be similar to the C-terminus of the other and vice versa and that the permuted regions cover a significant portion of both proteins (see the scheme in Figure 2A). This sequence-based definition enabled us to design a fast screening algorithm for such cases (Uliel et al., 1999), which is used here as a first step in a systematic search for examples of circular permutations in the entire Swissprot database. Note that our definition will exclude cases where the circular permutation is within a partial region of the protein (Figure 2B). These cases might be confusing and such examples have sometimes been described in the literature as `circular permutations' (see Results). However, we strongly argue that they are not circular permutations of proteins and that they should be analyzed as `swaps' which would require other computational tools.
In this study, we used computational algorithms that enabled us to screen the entire database of known proteins to identify novel examples of circular permutations, to assess how commonly this phenomenon occurs and to gather evidence that might help to distinguish between the two possible mechanisms for the evolutionary generation of circularly permuted protein pairs.
Methods |
---|
It was noted (Lindqvist and Schneider, 1997; Russell and Ponting, 1998) that the standard tools of sequence comparison are not suitable for the detection of circular permutations: regular global alignment algorithms [e.g. the Needleman–Wunsch algorithm (Needleman and Wunsch, 1970)] are sequential in nature and fail to align sequences which contain sequence permutations. In some cases, the relatedness between the proteins would still be detected based on one common domain alone, but their specific relationship by circular permutation will remain undetected. Furthermore, in cases where a protein has undergone a circular permutation early in its evolution and both proteins accumulated a significant number of other genetic mutations later, it is possible that their alignment will not reveal any detectable similarity. However, if the original cyclic permutation can be identified and `corrected', it is conceivable that sufficient residual similarity exists, which may be identified.
Another alternative is using database search methods such as Blast (Altschul et al., 1990) to search for circular permutations. These methods identify sequences that share short common fragments and report these best local matches. In principle, the list of significant local matches can be analyzed to check for evidence of possible circular permutations. In practice, we found that it is difficult to automatically screen Blast output for circular permutations. This is mainly due to the fact that Blast [and even the gapped-Blast variant (Altschul et al., 1997)] tends to break alignments into smaller fragments, thus making an unambiguous reconstruction problematic. An additional serious problem is that sequences that contain repeats and duplications are difficult to differentiate from true examples of circular permutations. Another problem is that in cases where the sequence similarity is low, two fragments with local similarity that, if taken together, could have been indicative of a circular permutation might be below the Blast detection level.
While standard automatic methods are not suitable for the detection of circular permutations, visual inspection tools such as dot matrix plots allow the detection of circular permutations, but in a slow manual procedure. Dot matrix plots (Maizel and Lenk, 1981) (also known as dotplots) are a simple, yet effective, sequence comparison tool utilizing manual visualization to identify relationships (such as similarity, repeats and self-complementarity) between sequences (Unger et al., 1986). The existence of pairs of diagonal lines that originate from different rows and columns and are off the main diagonal is a characteristic feature of circular permutation (see Figure 3).
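The diagonal signature described above can be illustrated with a minimal sketch. It is not the dotplot program used in the study: the function name `dotplot_offsets`, the exact-match window of three residues and the toy sequence are all assumptions made for illustration. Each exact window match between the two sequences is recorded by its diagonal offset (column minus row); an exact circular permutation yields two off-main diagonals.

```python
def dotplot_offsets(s1, s2, window=3):
    """Count exact window matches per diagonal (offset = j - i).

    Hypothetical helper: a real dotplot would use a scoring matrix
    and a similarity threshold rather than exact identity.
    """
    counts = {}
    for i in range(len(s1) - window + 1):
        word = s1[i:i + window]
        for j in range(len(s2) - window + 1):
            if s2[j:j + window] == word:
                counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Toy sequence (made up for this example), 22 residues long.
seq = "MKTAYIAKQRQISFVKSHFSRQ"
# Exact circular permutation: the last 8 residues move to the front.
permuted = seq[-8:] + seq[:-8]

offsets = dotplot_offsets(seq, permuted)
# Two diagonals appear, one for each permuted fragment.
print(sorted(offsets))
```

For diverged natural sequences the diagonals would be broken and noisy, which is why the text describes dotplot inspection as a manual step.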
We have previously described the development and testing of an algorithm that allows the automated screening of protein sequence data for the existence of circularly permuted sequence pairs (Uliel et al., 1999). The algorithm is briefly reviewed here, together with a description of how it was implemented for the screening of the entire Swissprot database: pairs of proteins that passed this screening test were subjected to more careful and computationally demanding algorithms. Pairs that passed the second test were subjected to manual analysis to validate their status.
The exact algorithm
We start with an overview of the straightforward exact algorithm to identify permuted proteins, since it provides the basis for understanding the much faster approximation variant that we have designed. The global edit distance between two sequences [using the Needleman–Wunsch algorithm (Needleman and Wunsch, 1970)] measures the number of genetic operations of insertion, deletion and substitution needed to change one sequence into the other. To detect a possible circular permutation between a pair of sequences, one can check if there exists a circular permutation of one of them that minimizes their edit distance. Thus, we defined the circular edit distance to be CED(S1,S2) = MIN_i{ED[S1,Perm_i(S2)]}, where ED(S1,S2) is the standard edit distance between two strings and Perm_i(S) is an exact circular permutation of i characters from the suffix of S to the prefix of S [e.g. Perm_3(AAAGCTG) = CTGAAAG] and the minimum is over all possible circular permutations.
This representation of the question lends itself to the following exact algorithm: (a) for each pair of proteins create all possible circular permutations of one protein relative to another; (b) for each permutation, calculate the regular edit distance by dynamic programming; and then (c) choose the lowest edit distance value. Furthermore, the set of all circular edit distances ED[S1,Perm_i(S2)] between two sequences can be used as a statistical ensemble to evaluate the significance of the value of the minimum, CED(S1,S2). Thus it provides a way to estimate whether a certain value of CED is statistically significant.
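Steps (a) to (c) can be sketched directly from the definition of CED. The code below is an illustrative sketch, not the implementation used in the study; the function names are assumptions, and unit costs for insertion, deletion and substitution are used throughout.

```python
def edit_distance(a, b):
    """Global edit distance with unit costs, by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def perm(s, i):
    """Move the last i characters of s to the front, i.e. Perm_i(s)."""
    if not s:
        return s
    i %= len(s)
    return s[-i:] + s[:-i] if i else s


def circular_edit_distance(s1, s2):
    """Return (CED, best shift, all distances) over every Perm_i(s2)."""
    distances = [edit_distance(s1, perm(s2, i)) for i in range(len(s2))]
    best = min(range(len(distances)), key=distances.__getitem__)
    return distances[best], best, distances


print(perm("AAAGCTG", 3))
ced, shift, ensemble = circular_edit_distance("CTGAAAG", "AAAGCTG")
print(ced, shift)
```

The returned list of distances plays the role of the statistical ensemble mentioned in the text: the minimum can be compared with the average of the remaining values.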
This algorithm has a time complexity of N^3, where N is the length of the proteins (N iterations of the standard N^2 edit distance algorithm). Recently, a theoretical algorithm was suggested (Landau et al., 1998) that solves the problem in asymptotic quadratic time. However, the pre-processing required for each pair of compared proteins and the large constant factors involved in running this algorithm make it impractical for comparison of a large number of pairs of proteins.
A time requirement of N^3 for a comparison between a pair of proteins is computationally expensive. It becomes prohibitive for a full pairwise comparison of the entire database. In our implementation, a single N^2 comparison of a pair of sequences each 300 amino acids in length takes about 0.005 s (using a Silicon Graphics R10000 processor). An exhaustive search of all possible cyclic permutations for a single protein pair should therefore require about 1.5 s (0.005 × 300). Thus, a complete survey of all pairs from the current protein database, which includes about 80 000 proteins, would take many CPU years.
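The order of magnitude behind `many CPU years' can be checked with a short calculation. It uses only the figures quoted above and assumes, for simplicity, that every protein is about 300 residues long and that all unordered pairs are compared; the variable names are illustrative.

```python
n_proteins = 80_000            # approximate database size quoted in the text
seconds_per_alignment = 0.005  # one N^2 comparison at N = 300
permutations_per_pair = 300    # one alignment per circular permutation

seconds_per_pair = seconds_per_alignment * permutations_per_pair
n_pairs = n_proteins * (n_proteins - 1) // 2   # unordered pairs
seconds_per_year = 365 * 24 * 3600

cpu_years = n_pairs * seconds_per_pair / seconds_per_year
print(f"{seconds_per_pair:.1f} s per pair, about {cpu_years:.0f} CPU years")
```

Under these simplifying assumptions the exhaustive survey comes to roughly 150 CPU years on the processor described.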
A fast screening algorithm
As an alternative, we designed an efficient algorithm that, although not guaranteed to find the optimal permutation, does perform very well in practice. The details of the algorithm and a demonstration of its effectiveness in a designed test case were described previously (Uliel et al., 1999). Here, we review the main premise of the algorithm.
The fast algorithm is a variation of the edit distance algorithm in which one sequence is compared with a duplication of the other sequence. Namely, for sequences S1 and S2, S1 is compared with S2S2 (see Figure 4). The principle is that in this way, all consecutive permutations, each consisting of a C-terminal fragment of S2 followed by an N-terminal fragment of S2, are simultaneously compared with S1.
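A minimal sketch of this idea follows. It is not the published implementation (see Uliel et al., 1999, for that): the function name is hypothetical, and the initialization of the first row with zeros, which lets S1 start at any position of S2S2, is an assumption of this sketch. The function returns the last row of the dynamic programming matrix, in which each cell holds the edit distance between S1 and the best-matching stretch of S2S2 ending at that column.

```python
def last_row_against_doubled(s1, s2):
    """Last DP row for s1 (rows) versus s2 + s2 (columns), unit costs.

    Assumption of this sketch: the first row is all zeros, so the
    match to s1 may begin at any column of the doubled sequence.
    """
    doubled = s2 + s2
    prev = [0] * (len(doubled) + 1)
    for i, ca in enumerate(s1, start=1):
        curr = [i]
        for j, cb in enumerate(doubled, start=1):
            curr.append(min(prev[j] + 1,                # skip a residue of s1
                            curr[j - 1] + 1,            # skip a residue of s2s2
                            prev[j - 1] + (ca != cb)))  # match or substitute
        prev = curr
    return prev


row = last_row_against_doubled("CTGAAAG", "AAAGCTG")
# The permuted copy of S1 sits inside S2S2, so one cell drops to zero.
print(row.index(min(row)), min(row))
```

A single matrix of size N × 2N is filled instead of N separate N × N matrices, which is the source of the speed-up; the price is that the matched stretch of S2S2 is not constrained to be exactly one full permutation of S2.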
Passing this approximate algorithm is not sufficient to establish a case of circular permutation. Even the exact algorithm (i.e. calculating the edit distance for each possible permutation) does not prove that a pair of proteins is related by circular permutation. Two of the obvious problems are that (1) it is not always clear how to define `a significant minimum' in the series of edit distances produced by the exact algorithm and (2) the internal structure of a protein (e.g. repeats or low-complexity regions) may blur the significance of the results. Thus, as is common in biological applications of string matching algorithms, the algorithmic part can only identify potential candidates that need to be further examined.
The combined procedure
The screening algorithm was used as the `main engine' of a large-scale survey of circular permutations in the protein database. We started with all proteins from Swissprot version 34.0. Since, by our definition, proteins of very different length are not candidates for circular permutation, only pairs of proteins of comparable length were tested. To this end, proteins were divided into 35 length groups: every 10 residues from length 100 to 300 (i.e. we grouped together all proteins of size 100–109, then 110–119, etc.), every 20 residues from length 300 to 400, every 25 from length 400 to 500 and every 50 from length 500 to 1000. Proteins shorter than 100 or longer than 1000 residues were not considered. On average, each length group contained about 1200 proteins.
Comparisons were made between all proteins that belong to length groups that differ by not more than two groups, i.e. each protein was compared with all proteins in its own length group and in each of the two adjacent longer and two adjacent shorter groups. For example, a protein of length 295 residues was compared with all proteins from size 270 to 340. The comparisons were performed using the approximate algorithm described above with a unit edit distance penalty of one for insertion, deletion and substitution.
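The grouping rule can be sketched as follows. The bin edges are taken literally from the description above, and treating each group as a half-open interval that ends just below the next edge is an assumption of this sketch; the function names are illustrative.

```python
from bisect import bisect_right


def group_edges():
    """Lower edges of the length groups, plus the final upper bound."""
    edges = list(range(100, 300, 10))    # every 10 residues
    edges += list(range(300, 400, 20))   # every 20 residues
    edges += list(range(400, 500, 25))   # every 25 residues
    edges += list(range(500, 1000, 50))  # every 50 residues
    edges.append(1000)
    return edges


def group_index(length, edges):
    """Index of the group containing `length`, or None if out of range."""
    if length < edges[0] or length >= edges[-1]:
        return None
    return bisect_right(edges, length) - 1


def comparison_range(length, edges, reach=2):
    """Half-open length range [lo, hi) covered by the neighbouring groups."""
    g = group_index(length, edges)
    if g is None:
        return None
    lo = edges[max(g - reach, 0)]
    hi = edges[min(g + reach + 1, len(edges) - 1)]
    return lo, hi


edges = group_edges()
print(comparison_range(295, edges))
```

For a protein of 295 residues this reproduces the range of 270 to 340 given in the example above.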
For each comparison, we screened the last row for a possible minimum. A possible minimum was defined if the value in a certain cell of the last row of the edit distance matrix was smaller than 95% of the average value of all cells in the last row. A permissive threshold of only 5% below the average was used, since all candidate pairs were further screened by the exact algorithm. About 300 000 pairs of proteins were advanced to the exact algorithm.
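The last-row criterion amounts to a one-line comparison. The sketch below assumes the last row is available as a list of numbers; the function name and the choice to return the column of the minimum are illustrative.

```python
def candidate_minimum(last_row, fraction=0.95):
    """Return the column of the minimum if it lies below
    `fraction` times the row average, otherwise None."""
    mean = sum(last_row) / len(last_row)
    col = min(range(len(last_row)), key=last_row.__getitem__)
    return col if last_row[col] < fraction * mean else None


print(candidate_minimum([10, 10, 10, 2, 10]))   # clear dip at column 3
print(candidate_minimum([10, 10, 10, 10]))      # flat row, no candidate
```

The same test with `fraction=0.90`, applied to the list of edit distances obtained for every circular permutation, would correspond to the stricter threshold used at the exact stage.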
With the exact algorithm, the lowest value achieved among all edit distance comparisons (i.e. one comparison for each possible circular permutation) was required to be below 90% of the average value. This stage left about 8000 pairs of proteins for further examination. Note that the exact algorithm also identified the position of the putative circular permutation (i.e. the location of the minimum).
As mentioned above, a minimum in the exact algorithm does not necessarily prove that an evolutionary event of circular permutation indeed took place. Thus, for each pair of candidate proteins additional tests were performed.
The first test involved visual inspection of the dotplot matrices, before and after the circular permutation was performed. Short diagonals that are off the main diagonal before the permutation and merge together along the main diagonal after the permutation provide a strong indication of a circular permutation (see Figure 3). In particular, a dotplot is a good tool to eliminate cases where the original proteins include internal repeats that can result in a false positive in the automated analysis. Next, the results of the standard sequence alignment procedure performed between the pairs of proteins before and after the permutation were compared. This was done using the GAP procedure of the GCG package with standard parameters. Naturally, the option of penalizing end gaps similarly to regular gaps was used to avoid the possibility of ignoring the terminal regions, which are important for detection of circular permutations.
The GAP procedure can calculate the Z-score of the significance of the alignment. This is done by comparing the quality (in terms of edit distance score) of the alignment with the quality of a large number of random alignments produced by a random shuffle of one of the sequences. Thus, we could compare the Z-score of the original alignment of the two proteins with the Z-score achieved after one of the proteins was circularly permutated. A large improvement in the Z-score of the alignment after circular permutation versus the Z-score of the alignment without circular permutation suggests that it is more likely that the two proteins are related by circular permutation than by regular alignment. We used a stringent cut-off of Z-score improvement after the circular permutation, by at least five standard deviation (SD) units. We validated that such an improvement in the Z-score is not observed when a large number of (non-permutated) sequences were subject to random circular permutations.
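The shuffle-based significance test can be sketched as below. This is not the GAP procedure of the GCG package: GAP scores alignments with a substitution matrix and gap penalties, whereas this sketch substitutes the unit-cost edit distance, and the function names, the number of shuffles and the toy sequence are assumptions. The Z-score is defined here so that larger values mean a better-than-random alignment.

```python
import random
import statistics


def edit_distance(a, b):
    """Global edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def z_score(s1, s2, n_shuffles=100, seed=0):
    """(mean shuffled distance - observed distance) / SD of shuffled."""
    rng = random.Random(seed)
    observed = edit_distance(s1, s2)
    shuffled = []
    for _ in range(n_shuffles):
        chars = list(s2)
        rng.shuffle(chars)
        shuffled.append(edit_distance(s1, "".join(chars)))
    sd = statistics.pstdev(shuffled)
    if sd == 0:
        return 0.0
    return (statistics.mean(shuffled) - observed) / sd


def z_improvement(s1, s2, shift):
    """Z-score gain after moving the last `shift` residues of s2 to the front."""
    rotated = s2[-shift:] + s2[:-shift]
    return z_score(s1, rotated) - z_score(s1, s2)


seq = "MKTAYIAKQRQISFVKSHFSRQ"    # toy sequence, made up for the example
permuted = seq[-8:] + seq[:-8]    # exact circular permutation of seq
gain = z_improvement(seq, permuted, shift=14)   # shift 14 restores seq
print(gain > 0)
```

In the survey the corresponding criterion was an improvement of at least five SD units; with real proteins the shuffles would number in the hundreds and the score would come from the alignment program itself.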
Pfam screening
As an independent test of the sensitivity of our screening procedure, we also employed a separate scheme based on a different definition of circularly permuted proteins. Thus, if our algorithm did not detect a large number of circularly permuted protein pairs, we would expect these to be found by the second, independent screen. For this procedure, we used a totally different approach to detect circular permutations by finding pairs of proteins that contain the same recognized domains, where these domains appear in the two proteins in a circularly permuted order relative to each other. We used the Pfam database, version 6.0 (Bateman et al., 2000), which is based on multiple alignments of protein domains or conserved protein regions. Pfam supplies (in a file called SWISSPFAM) the domain structure of 71 415 proteins from the Swissprot database, according to Pfam definitions. We define two proteins to have a domain circular permutation if they contain the same domains, but these domains appear in one protein in an order that is a circular permutation of their order in the other protein. A direct comparison of each pair of Swissprot proteins to detect whether they contain circular permutations of domains would require N^2 operations.
To speed up the process, we designed the following procedure: for each protein in Swissprot, we stored in one line a list of its domains sorted to a canonical lexicographical order by their names. We also stored for each protein its distance from this order, i.e. the weighted number of swap operations needed to reach the canonical order from the original order. We then sorted these lines to find all proteins that share the same domains, which will appear as consecutive lines. From these lines we selected only pairs that have a different distance to the canonical order, indicating that they have a different original order. These protein pairs were then manually evaluated to determine whether their domains are ordered as a circular permutation of each other. Since this procedure is based on sorting, it has a run time of N log N, where N is the number of proteins, rather than the naive N^2 comparisons.
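The grouping step can be sketched as follows, with two deliberate simplifications that differ from the procedure described above: proteins are grouped with a dictionary keyed on the sorted domain list instead of by sorting lines, and the original domain order is compared directly instead of through a stored swap distance. The rotation check stands in for the manual evaluation. All protein and domain names in the example are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def is_rotation(a, b):
    """True if tuple b is a non-trivial circular permutation of tuple a."""
    if len(a) != len(b) or a == b:
        return False
    doubled = a + a
    return any(doubled[k:k + len(b)] == b for k in range(1, len(a)))


def domain_permutation_candidates(proteins):
    """proteins maps a protein name to its domains in sequence order."""
    by_content = defaultdict(list)
    for name, domains in proteins.items():
        # Canonical key: the same domains in lexicographical order.
        by_content[tuple(sorted(domains))].append((name, tuple(domains)))
    pairs = []
    for members in by_content.values():
        for (n1, d1), (n2, d2) in combinations(members, 2):
            if is_rotation(d1, d2):
                pairs.append((n1, n2))
    return pairs


# Hypothetical input; real data would be parsed from the SWISSPFAM file.
example = {
    "PROT_A": ["catalytic", "binding"],
    "PROT_B": ["binding", "catalytic"],
    "PROT_C": ["catalytic", "binding"],
    "PROT_D": ["catalytic", "other"],
}
print(domain_permutation_candidates(example))
```

Only proteins that share a key are compared with each other, so the quadratic work is confined to the usually small groups with identical domain content.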
Note that this domain-based definition of circular permutation avoids some of the restrictions imposed by our sequence-based definition. For example, the proteins do not have to be of similar sizes for a circular permutation to be considered. It suffices, for example, that each protein contains, in a different order, the same two recognized domains, but these domains do not have to cover the entire sequences. The domain-based definition excludes cases in which the permutation is within a partial region of the proteins (Figure 2B), which we do not consider as a circular permutation. Naturally, this definition depends on domains that are pre-defined in the Pfam database and thus it is less general than the direct sequence-based definition. Thus, in a sense, these two alternative definitions can be considered complementary.
Results |
---|
Another interesting case is the circular permutation between GUN2_THEFU and GUN2_THEF, two endoglucanase proteins. Each of these proteins contains two known motifs: a glycosyl hydrolase catalytic domain and a cellulose binding domain, which utilizes two tryptophan residues to facilitate cellulose binding. The relative order of these two domains is different in the two bacterial species, as can be seen in the dotplot presented in Figure 3C. This seems to suggest that the binding activity and the catalytic activity are spatially distinct and that their precise orientation is not crucial for the activity of the protein. These proteins are similar to the β-glucanase proteins which appear in the list of circular permutations compiled by Lindqvist and Schneider (Lindqvist and Schneider, 1997).
Our data also identify several examples that were not previously noted: one example is a circular permutation between PABS_STRGR (p-aminobenzoate synthase) from Streptomyces griseus and TRPE_RHIME (anthranilate synthase) from Rhizobium meliloti (Figure 6C). These two proteins are involved in related enzymatic activities. PABS_STRGR catalyzes the synthesis of p-aminobenzoate from chorismate and glutamine and TRPE_RHIME catalyzes the synthesis of anthranilate (o-aminobenzoate) from the same starting materials. The activity of the N-terminal domain of PABS_STRGR involves the removal of the ammonia group from a glutamine molecule and its subsequent transfer to a specific substrate. The chorismate binding is performed by the C-terminal domain. In TRPE_RHIME, these domains appear in the reverse order. It is interesting that in other organisms (e.g. Methanococcus jannaschii and Thermus aquaticus) these two domains are not part of the same gene. Rather, they are translated as two separate polypeptides which associate in vivo as subunits of a larger complex. Surprisingly, although the two domains are on different genes in Thermotoga maritima, the coding regions of these two genes overlap by more than 100 nucleotides.
An additional example is the circular permutation between two H-1 histone proteins, H1B_PLADU from the worm Platynereis dumerilii and the bovine H11_BOVIN. The linker histone H-1 links nucleosomes into a higher order chromatin structure. A 3D NMR structure for the globular domain is available (1GHC in PDB) (Cerf et al., 1994) and was shown to be similar to other DNA binding domains. In H1B_PLADU, the N-terminal domain (residues 8–75) contains the globular domain (called linker_histone in the Pfam classification) and the C-terminal domain (residues 76–119) is lysine-rich. In H11_BOVIN, the N-terminus contains the lysine-rich domain, followed by the globular linker-histone domain. The organization found in the bovine protein is common in almost all organisms, but forms similar to H1B_PLADU are found in various sea urchins. It seems that the lysine-rich domain, which provides the general DNA binding function, is not required to be in a specific location in the protein. The results of this alignment are shown in Figure 6D.
Another case is the circular permutation between LPSZ_RHIME and LIPA_NEIME, two proteins that are involved in polysaccharide processing. LPSZ_RHIME is a cytoplasmic protein which is involved in the invasion of nitrogen fixation nodules and may be involved in the biosynthesis of lipopolysaccharides. LIPA_NEIME, which is probably an inner membrane protein, is involved in the phospholipid modification of the capsular polysaccharide, which is a requirement for its translocation to the cell surface. The dotplot comparing these two proteins, shown in Figure 3B, strongly suggests a circular permutation. Although the role of the two permuted segments is not clear, their ability to interchange may again suggest a two-domain structure with a binding domain and a catalytic domain.
A main goal of this study was to prepare a conclusive list of all pairs of known proteins that might be related by circular permutation. Hence we needed to determine whether our algorithm was sufficiently sensitive to detect all permuted proteins. To this end, we compared our systematic results with the compilation of such examples that have been described in the literature. The comprehensive review by Lindqvist and Schneider (Lindqvist and Schneider, 1997) was used as the source of these examples.
(a) The case of lectins was found by our method (Table I, LEC_BOWMI versus LECA_DIOGR). Note that this is the special case in which the circular permutation occurs as a post-translational modification on the protein level.
(b) The case of bacterial β-glucanase was identified (Table I, GUN2_THEFU versus GUNA_CELFI).
(c) The case of swaposin: this case does not fit our definition of circular permutation. First, the two proteins involved are of very different sizes (SAP_BOVIN is 80 amino acids long and ASPR_CUCPE is 513 amino acids long); hence our procedure did not compare them. Furthermore, this is not a case of a full circular permutation where the N- and C-terminal regions are swapped. As discussed by Lindqvist and Schneider, in this protein pair the permutation occurs within a domain of the larger protein (Lindqvist and Schneider, 1997).
(d) The case of α-amylase and α-1,3-glucan-synthesizing glucosyltransferase: again, because of the large size discrepancy between the two proteins (AMY2_ECOLI is 495 residues long and GTFD_STRMU is 1462 residues long), this case was not considered. Here, too, the circular permutation is within a domain of the longer protein.
(e) β-Glucosidase: this clear-cut case of circular permutation was detected by our procedure (Table I, BGLS_BUTFI versus BGLB_CLOTM).
(f) Transaldolase: a strong case for circular permutation between transaldolase B and aldolase has been made (Jia et al., 1996). However, their argument was based on the superposition of a small number of key residues based on structural alignment. As indicated by Lindqvist and Schneider (Lindqvist and Schneider, 1997) and shown by our dotplot (Figure 7A), the overall sequence similarity is rather weak and indeed our procedure did not detect a circular permutation in this case.
(h) C2 domain: this case also did not pass our criteria (e.g. in comparing KPC2_DROME with KPCL_MOUSE), because the permutation here is within the N-terminal region and cannot be considered as a circular permutation of the entire protein.
Pfam screening
We screened Swissprot for all pairs of proteins that have a circular permutation of domains as defined by the Pfam database (Bateman et al., 2000). We compared all pairs of proteins that contain two to five domains. We found circular permutations mainly for proteins that include two domains. The only exception is the case of the β-glucosidase proteins (BGLS_BUTFI and BGLB_CLOTM) that by the Pfam classification contain three domains. We did not find any other case of proteins that have three or more domains that are related by circular permutation. Ten cases of proteins that contain two domains in a different order were found. Of these 10 examples, three cases were found by our sequence-based procedure (GUN2_THEFU and GUNA_CELFI; CIN_DROME and CNX1_ARATH; HK25_XENLA and HXB5_BRARE); seven are novel cases which are shown in Table IV. Inspection of these cases revealed that they were not detected by our sequence-based algorithm owing to a large difference in protein sizes (the first four cases in Table IV) or because the permuted domains covered only relatively small segments of the protein lengths and thus the improvement in Z-score was not significant enough (the last three cases in Table IV).
Discussion |
---|
The list presented here adds several examples of possible circular permutation to those which were previously identified. About half of the instances are of longer proteins (12 of the 25 cases are longer than 300 residues). This preference for larger proteins suggests that independent fusion events of functional units in different orders are the driving mechanism, rather than direct circular permutation of a `parent' gene to form a mutant. This and the small overall number of cases that we could find suggest that there is no direct genetic mechanism by which circular permutation can frequently occur.
The overall number of circularly permuted pairs is small, based on our analysis of protein sequences as they appear in the Swissprot database. It must be noted that most of the sequence information in Swissprot is based on cDNA data and not on direct protein sequences. Thus, our procedure mainly compared putative sequences as translated from mRNA sequences and not actual protein sequences. The example of lectins shows that it is possible to generate circular permutations directly on the protein level by cleavage and ligation. Clearly, the frequency of circular permutations on the protein level, as well as other post-translational editing events, can be explored only when proteomics projects provide sufficient direct protein sequence data.
In the first part of our study we used a strict sequence-based definition of circular permutations. To confirm our results, we added a domain-based definition. Although we believe that these definitions taken together are inclusive, we must note that they do not cover other types of permutations that could occur, such as circular permutations within domains that do not span the whole protein (e.g. Figure 2B and C). These and other types of domain shuffling (see, for example, Hopfner et al., 1998) which might be more common in protein evolution require different tools for analysis and are part of a parallel study which is currently under way.
Notes |
---|
Acknowledgments |
---|
References |
---|
Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402.
Ay,J., Hahn,M., Decanniere,K., Piotukh,K., Borriss,R. and Heinemann,U. (1998) Proteins, 30, 155–167.
Bateman,A., Birney,E., Durbin,R., Eddy,S.R., Howe,K.L. and Sonnhammer,E.L.L. (2000) Nucleic Acids Res., 28, 263–266.
Carrington,D.M., Auffret,A. and Hanke,D.E. (1985) Nature, 313, 64–67.
Cerf,C., Lippens,G., Ramakrishnan,V., Muyldermans,S., Segers,A., Wyns,L., Wodak,S.J. and Hallenga,K. (1994) Biochemistry, 33, 11079–11086.
Cunningham,B.A., Hemperly,J.J., Hopp,T.P. and Edelman,G.M. (1979) Proc. Natl Acad. Sci. USA, 76, 3218–3222.
Duret,L., Gasteiger,E. and Perriere,G. (1996) Comput. Appl. Biosci., 12, 507–510.
Goldenberg,D.P. and Creighton,T.E. (1983) J. Mol. Biol., 165, 407–413.
Heinemann,U. and Hahn,M. (1995a) Prog. Biophys. Mol. Biol., 64, 121–143.
Heinemann,U. and Hahn,M. (1995b) Trends Biochem. Sci., 20, 349–350.
Hemperly,J.J. and Cunningham,B.A. (1983) Trends Biochem. Sci., 8, 100–102.
Hennecke,J., Sebbel,P. and Glockshuber,R. (1999) J. Mol. Biol., 286, 1197–1215.
Hopfner,K.P., Kopetzki,E., Kresse,G.B., Bode,W., Huber,R. and Engh,R.A. (1998) Proc. Natl Acad. Sci. USA, 95, 9813–9818.
Iwakura,M., Nakamura,T., Yamane,C. and Maki,K. (2000) Nature Struct. Biol., 7, 580–585.
Jia,J., Huang,W., Schorken,U., Sahm,H., Sprenger,G.A., Lindqvist,Y. and Schneider,G. (1996) Structure, 4, 715–724.
Landau,G.M., Myers,E.M. and Schmidt,J.P. (1998) SIAM J. Comput., 27, 557–582.
Lindqvist,Y. and Schneider,G. (1997) Curr. Opin. Struct. Biol., 7, 422–427.
Luger,K., Hommel,U., Herold,M., Hofsteenge,J. and Kirschner,K. (1989) Science, 243, 206–210.
Lupas,A., Engelhardt,H., Peters,J., Santarius,U., Volker,S. and Baumeister,W. (1994) J. Bacteriol., 176, 1224–1233.
Maizel,J.V.,Jr. and Lenk,R.P. (1981) Proc. Natl Acad. Sci. USA, 78, 7665–7669.
Needleman,S.B. and Wunsch,C.D. (1970) J. Mol. Biol., 48, 443–453.
Ponting,C.P. and Russell,R.B. (1995) Trends Biochem. Sci., 20, 179–180.
Russell,R.B. and Ponting,C.P. (1998) Curr. Opin. Struct. Biol., 8, 364–371.
Uliel,S., Fliess,A., Amir,A. and Unger,R. (1999) Bioinformatics, 15, 930–936.
Unger,R., Harel,D. and Sussman,J.L. (1986) CABIOS, 2, 283–289.
Received August 15, 2000; revised March 7, 2001; accepted May 11, 2001.