Structure and function prediction of the Brucella abortus P39 protein by comparative modeling with marginal sequence similarities

K. de Fays1, A. Tibor, C. Lambert, C. Vinals, P. Denoël, X. De Bolle, J. Wouters, J.-J. Letesson and E. Depiereux

Unité de Recherche en Biologie Moléculaire, Facultés Universitaires Notre-Dame de la Paix, B-5000 Namur, Belgium


    Abstract
 
A methodology is proposed to solve a difficult modeling problem related to the recently sequenced P39 protein. This sequence shares no similarity with any known 3D structure, but a fold is proposed by several threading tools. The difficulty in aligning the target sequence on one of the proposed template structures is overcome by combining the results of several available prediction methods and by refining a rational consensus between them. In silico validation of the obtained model and a preliminary cross-check with experimental features allow us to state that this borderline prediction is at least reasonable. This model raises relevant hypotheses on the main structural features of the protein and allows the design of site-directed mutations. Knowing the genetic context of the P39 reading frame, we are now able to suggest a function for the P39 protein: it would act as a periplasmic substrate-binding protein.

Keywords: consensus/fold recognition/periplasmic sugar-binding proteins/protein modeling/secondary structure prediction


    Introduction
 
The facultative intracellular pathogen Brucella abortus elicits a cellular immune response in addition to an antibody response in bovine species. In a study to identify specific antigens that can be useful to discriminate between infected and vaccinated animals, a T-cell immunodominant antigen of 39 kDa (P39) has been identified (Denoël et al., 1997). The gene encoding the P39 protein was cloned and sequenced, but until now its deduced amino-acid sequence did not allow the prediction of its functional role. As knowledge of the three-dimensional (3D) features of P39 could be useful to help in determining functional features, possible interactions with other compounds and phenotypic effects of mutations, 3D prediction steps have been attempted. In the case of P39, no sequence homology with a protein of known 3D structure was detected. This excludes the straightforward homology modeling procedures available nowadays (Guex and Peitsch, 1997) for lack of a proper template.

However, knowledge-based modeling, also called comparative model building, can still be used when only low sequence homologies (below 25%) exist between a target sequence and protein structures (Bajorath et al., 1993; Vinals et al., 1995; Tramontano, 1998; Wouters and Baudoux, 1998). Such modeling techniques start from the premise that adequate known 3D structures can be used as templates to model proteins of unknown structure, even if structural similarities are not detectable in terms of sequence. Identification of template candidates can be achieved using a number of new methods developed in the last few years, in particular efficient secondary structure prediction techniques (King and Sternberg, 1996; Rost, 1996) and fold recognition tools (threading) (Jones et al., 1992; Sippl and Weitckus, 1992; Rice et al., 1997).

Correspondences between target and template residues, established from the results of the threading programs and using predicted secondary structure as guides, provide structural information for the construction of the target fold in a similar way to multiple sequence alignment for homology modeling (Aszodi et al., 1997).

In the present work, we propose to use a combination of sequence alignments, consensus of secondary structure predictions and fold recognition tools to identify a reasonable template and to increase the accuracy of the modeling process of P39.

It appears that complementary results are consistent with one another and that the model, although rough, allows us to make relevant hypotheses on the main structural features of the protein and to select potential target residues for site-specific mutagenesis studies. Considering the genetic context of the gene encoding P39, the model suggests a function for the protein.


    Materials and methods
 
The target is the deduced amino acid sequence of the B.abortus P39 gene product, described by Denoël et al. (1997). Threading calculations, model building, displays and evaluations were executed on a Silicon Graphics Indigo workstation running IRIX 5.3. The other results were obtained from World-Wide-Web servers. A flow chart of the steps followed during the modeling procedure is shown in Figure 1.



Fig. 1. Outline of the general strategy in the modeling procedure of P39.

 
Sequence homology search

Sequence analysis was performed using the P39 sequence as the query with homology search tools BLAST (Altschul et al., 1990), BLAST2 (or Gapped-BLAST) (Altschul et al., 1997) and FASTA (Pearson and Lipman, 1988; Pearson, 1990) in the following databases: non-redundant GenBank CDS translations, PDB, SwissProt, SP-update and PIR. The P39 sequence was also compared, using BLAST and FASTA, with the Brookhaven Database (PDB) only, to detect potentially weak homologies with proteins of known structure.

In every case default parameters were used. Each tool provided a measure of the statistical significance of the alignment between the query sequence and each matching sequence.

Finally, we used the BLOCKS server (Henikoff and Henikoff, 1994) to find structurally or functionally conserved stretches of residues and the SignalP V1.1 server (Nielsen et al., 1997) to detect signal peptides and cleavage sites.

Multiple alignment

Multiple alignments of P39 and sequences of interest were executed using ClustalW (Thompson et al., 1994) and MATCH-BOX (Depiereux et al., 1997).

Secondary structure prediction

The target sequence was used as input to several web servers for secondary structure prediction, including PHD (Rost and Sander, 1993, 1994; Rost et al., 1994), DSC (King and Sternberg, 1996), PREDATOR (Frishman and Argos, 1997), SSP (Solovyev and Salamov, 1994), NNPredict (McClelland and Rumelhart, 1988; Kneller et al., 1990) and the IBCP-Web server, based on the Gibrat (Gibrat et al., 1987), Levin (Levin et al., 1986), DPM (Deleage and Roux, 1987) and SOPMA (Geourjon and Deleage, 1995) methods. PHDhtm (Rost et al., 1995, 1996) was also used for the prediction of putative transmembrane helices.

Fold recognition

Fold recognition experiments were performed to detect similarities between protein 3D structures in spite of the lack of any statistically significant sequence similarity. For this we used the following programs: ProFIT 2.0 (Sippl and Weitckus, 1992), THREADER 2 (Jones et al., 1992), UCLA–DOE (Fischer and Eisenberg, 1996) and Topits (Rost, 1995). We used standard libraries provided by the authors and kept all program command line options at default. For each method, results are given as a list of possible fold candidates in decreasing order of probability, where expected structural matches are ranked at the top of the list (highest Z-scores and lowest energy). Similarities of the various candidate folds were analyzed using the SCOP classification (Murzin et al., 1995). The 3D coordinates of the best hits were extracted from PDB and primary sequences were retrieved from the SwissProt (Bairoch and Apweiler, 1997) or FSSP (Holm and Sander, 1998) databases.

Sequence-structure alignment

A consensus alignment was achieved manually to obtain the most reliable alignment between the target sequence and the template structure. This was obtained by combining (i) a structural alignment (from the FSSP database) of four homologous structures, (ii) the alignments generated by fold recognition methods and (iii) three alignments based on sequence similarity [ClustalW, MATCH-BOX and Align (Myers and Miller, 1988)]. This consensus was optimized using the consensus of predicted secondary structure and information from the 3D structure of the template.

Modeling

The target–template (1D–3D) alignment was edited with the HOMOLOGY module of MSI (San Diego, CA), then submitted to the program MODELLER4 (Sali and Blundell, 1993) to obtain the 3D model of the target. Graphical displays were generated with the INSIGHTII molecular modeling system of MSI (Molecular Simulations, 1996). The resulting model was checked with PROSAII (Sippl, 1993) and PROCHECK (Laskowski et al., 1993).


    Results and discussion
 
Sequence analysis

Only two sequences were found to be significantly similar to P39 by each of the three homology search tools (BLAST, BLAST2 and FASTA). The first hit is a partial sequence of an excreted protein of unknown function from Leptothrix discophora, excA (Corstjens, 1993); the second is the precursor of a periplasmic multiple sugar-binding protein from Streptococcus mutans, MsmE (Russell et al., 1992). Among other less significant matches from FASTA we found three additional periplasmic binding proteins: a putative periplasmic maltose-binding protein precursor from Thermotoga maritima (Liebl et al., 1997), a glycerol-3-phosphate-binding periplasmic protein precursor from Escherichia coli (Overduin et al., 1988) and a putative maltose-binding protein from Streptomyces coelicolor (van Wezel et al., 1997). Homology search against the PDB database highlighted three maltose/maltodextrin-binding protein structures, but without any statistically significant sequence homology [P(N) > 0.99].

Multiple alignments performed on these sequences and P39 did not allow us to detect clearly conserved regions. However, searches in the BLOCKS database revealed that P39 contains two of the eight signatures described for maltose-binding proteins (Accession No. PR00181): residues 127–146 aligned with signature D and residues 334–353 aligned with signature H.

Finally, analysis of the predicted amino acid sequence with the SignalP web server revealed no typical N-terminal signal sequence.

Prediction of P39 2D and 3D structures

As no strong evidence was obtained by similarity searches, the logical next step was to predict secondary and tertiary structures.

The secondary structure predictions help in refining the alignment of the target sequence with the fold candidate obtained by threading. The positioning of helices, which are systematically predicted with higher accuracy, is especially useful. To improve the reliability of the predictions, a rational consensus between the different outputs obtained was calculated manually (Figure 2) according to a scoring pattern detailed hereafter. First, a score was assigned to each position of each prediction, according to the confidence level (c.l.) provided by the method, when available, and to the predicted secondary structure type, the score being positive for helices, H, and negative for strands, E (Table I).



Fig. 2. Consensus of predicted secondary structures determined as described in Materials and methods. The scoring scheme is detailed in Table I. α-Helices are depicted in pale gray and β-strands are in dark gray.

 

Table I. Scoring scheme of the secondary structure predictions
 
A sum-score was obtained for each residue by adding the individual scores given by all the methods. Finally, an α-helix was predicted for a sum-score ≥ 6 (the threshold usually observed at the edges of helices) and a β-strand for a sum-score ≤ –4 (the threshold is relaxed because strands are generally underestimated). The consensus obtained indicates 17 helices (36.6% of the total length) and eight β-strands (6.8%).
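The sum-score rule can be sketched in a few lines of Python. This is only an illustration: the per-position scores below are invented placeholders (the actual values of Table I are not reproduced here), the function name is hypothetical, and the strand threshold is read as an upper bound (strand at –4 or less). Only the two threshold values come from the text.

```python
from typing import List

HELIX_THRESHOLD = 6    # from the text: helix if the sum-score is 6 or more
STRAND_THRESHOLD = -4  # from the text, read here as: strand if the sum-score is -4 or less


def consensus_secondary_structure(score_rows: List[List[int]]) -> str:
    """Sum per-method scores column by column and call H, E or '-' (no consensus)."""
    if not score_rows:
        return ""
    length = len(score_rows[0])
    if any(len(row) != length for row in score_rows):
        raise ValueError("all methods must score the same number of residues")
    calls = []
    for position in range(length):
        total = sum(row[position] for row in score_rows)
        if total >= HELIX_THRESHOLD:
            calls.append("H")
        elif total <= STRAND_THRESHOLD:
            calls.append("E")
        else:
            calls.append("-")
    return "".join(calls)


# Hypothetical scores for three methods over six residues
# (positive = helix, negative = strand, 0 = no call).
toy_scores = [
    [3, 3, 2, 0, -2, -2],
    [2, 3, 1, 0, -1, -2],
    [2, 1, 0, 0, -1, -1],
]
print(consensus_secondary_structure(toy_scores))  # HH--EE
```

The asymmetric thresholds mean that a strand call needs less accumulated evidence than a helix call, which mirrors the stated reason for relaxing the strand threshold.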

Five fold recognition programs were also used and the results were cross-correlated to extract the most reliable candidate. Four methods out of the five placed the same protein at the top of the list (Table II), with a high confidence level (except for Topits, which gave a confidence level of 33%). This protein is MalE, a maltose/maltodextrin-binding protein of E.coli [periplasmic component family of the binding protein-dependent (BPD) transporters], that was crystallized either in a complex with or without β-cyclodextrin (1dmb and 1omp structures, respectively; Sharff et al., 1992, 1993). Its total length (398 amino acids; 370 in its mature form) is similar to that of P39 (383 amino acids) and both sequences share between 14 and 18% identity depending on the alignment method used. Its secondary structure ratio (39.7% of α-helical residues, 15.4% strand residues) is in the range of the P39 predictions. Two other structures (1pot and 1sbp) belonging to the same family as MalE were found within the 10 best fold candidates proposed by ProFIT (data not shown).


Table II. Top scores obtained by fold recognition programs with P39 as query
 
All these results strongly suggest that one of the two MalE structures (1omp or 1dmb) is the best template candidate for modeling P39. At this stage, not knowing of any advantage of one form over the other, we decided to continue with the 1omp fold, because the latter is the native unliganded form of MalE.
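The cross-correlation of fold recognition results amounts to asking which fold is ranked first by the most methods. A minimal sketch follows, assuming each method returns a ranked list of fold identifiers. The method names and the non-MalE entries are invented placeholders, not the contents of Table II, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def consensus_top_hit(ranked_hits: Dict[str, List[str]]) -> Tuple[Optional[str], int]:
    """Return the fold ranked first by the most methods, with its vote count."""
    votes = Counter(hits[0] for hits in ranked_hits.values() if hits)
    if not votes:
        return None, 0
    fold, count = votes.most_common(1)[0]
    return fold, count


# Placeholder rankings: method names and the non-MalE folds are invented.
toy_rankings = {
    "method_1": ["MalE", "fold_x"],
    "method_2": ["MalE", "fold_y"],
    "method_3": ["MalE", "fold_x"],
    "method_4": ["MalE", "fold_z"],
    "method_5": ["fold_x", "MalE"],
}
fold, count = consensus_top_hit(toy_rankings)
print(fold, count, "of", len(toy_rankings))  # MalE 4 of 5
```

A simple vote ignores each method's own confidence measure, so it would complement rather than replace an inspection of the individual Z-scores and energies.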

1D–3D alignment

A consensus alignment between 1omp and P39 was obtained (as described in Materials and methods) by combining an FSSP-structure alignment of 1dmb, 1omp, 1pot and 1sbp, with sequence–structure alignments given by threading programs and sequence alignments between P39 and MalE. Interestingly, all of the programs aligned residues 127–146 of P39 with the maltose-binding protein signature D (residues 125–144 of 1omp), as predicted by BLOCKS. This signature was thus used as an anchor point for starting the manual fitting of the consensus in order to optimize alignment of secondary structures of P39 and 1omp. This step minimizes gaps and confines their position to loops. However, secondary structure prediction could not guide the alignment in two regions of P39 (regions 45–125 and 155–190, on both sides of the residues predicted to be similar to signature D) because the predicted secondary structure did not correspond to the 1omp secondary structures. Nevertheless, we could find in these regions sequence similarities with two other maltose-binding protein signatures (Figure 3): residues 48–54 sharing 28.6% identity in a seven amino acid overlap with MalE signature B and residues 101–114 sharing 14.3% identity in a 14 amino acid overlap with MalE signature C. Using these signatures as anchor points, we moved gaps toward regions that are not in an α-helix or a β-strand, generally upstream from a proline residue.



Fig. 3. Consensus of the sequence (P39)–structure (1omp) alignment. 1omp and P39 secondary structures are shown (α-helices are in pale gray and β-strands in dark gray). MalE signatures A, B, C, D, F and H and P39 predicted signatures D and H are in bold. Boxes correspond to sequence similarities between P39 and MalE signatures.

 
With this final alignment, shown in Figure 3, we were able to find two additional signatures, one (residues 5–15) sharing 45.5% identity in an 11 amino acid overlap with MalE signature A and the other (residues 213–228) sharing 25% identity in a 16 amino acid overlap with MalE signature F.
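The identity percentages quoted for the signature overlaps are simple ratios of identical residues to overlap length. The sketch below shows the calculation; the identical-residue counts (5, 2, 2 and 4) are inferred from the reported percentages rather than stated in the text, and the aligned fragments are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns (no gap in either sequence) holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs)


# Reported overlaps with the identical-residue counts they imply (inferred, not from the source).
for label, identical, overlap in [("A", 5, 11), ("B", 2, 7), ("C", 2, 14), ("F", 4, 16)]:
    print(f"signature {label}: {identical}/{overlap} = {100 * identical / overlap:.1f}%")

# Invented aligned fragments: 2 identical residues over 7 columns.
print(round(percent_identity("ACDEFGH", "AKLMFNP"), 1))  # 28.6
```

Over such short overlaps a single residue changes the percentage by 6 to 14 points, which is why these matches serve as anchor points for the alignment rather than as statistical evidence on their own.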

Building and evaluation of the initial model

The coordinates of the 1omp structure were assigned to the P39 sequence according to the consensus alignment using MODELLER4. The resulting model, shown in Figure 4, was analyzed without any additional minimization steps. In this case, molecular dynamics was not applied because it is likely to make the prediction worse rather than better.



Fig. 4. Unrefined model of P39, generated with MolScript (Kraulis, 1991). The three inter-domain linkers forming the hinge region are indicated by asterisks. The presumed ligand-binding site and the N-terminal extremity are shown by an arrow. Two circles surround regions that need further refining: part of the C-lobe that was deduced from a domain unique to the maltose-binding proteins and the third inter-domain linker.

 
The structural model consists mainly of two globular domains (the N-lobe and the C-lobe) separated by a deep cleft. Each domain has a central core of a four-stranded β-sheet with α-helices on each side. The hinge region, composed of three peptide segments, connects the two domains.

A preliminary evaluation with PROCHECK and PROSAII shows that most of the current P39 model is free of stereochemical aberrations. With PROCHECK, 95.2% of residues are found in 'most favoured' and 'additional allowed' regions and only seven residues, generally located in loops, are scored in 'disallowed' regions. In the PROSAII energy profile analysis, only six residues (residues 322–326 and G338) have unfavorable positive energies. Residues 322–326 are located in an exposed loop constituting the third inter-domain linker. However, inter-domain linkers are known to be the most variable regions in the folding topology of the periplasmic substrate-binding protein structures.

Finally, these observations in conjunction with known structural features of the periplasmic substrate-binding proteins allow us to point out the more reliable regions of the model. These are represented in Figure 5, on the 1omp topology. Elements colored in gray on the topology plot are the most reliable in our model, as they correspond to the regions of the periplasmic substrate-binding protein structures that have the greatest structural homology. The white elements, corresponding to the extremities of the structure, the third inter-domain linker and part of the C-lobe (which was deduced from a domain unique to the maltose-binding proteins), need to be further refined. These regions are circled in Figure 4.



Fig. 5. Topology of 1omp. α-Helices are represented by circles (from I to XVI) and β-strands by triangles (from B to M). The first β-strand (A) is not shown as it is not conserved in the P39 structure. Elements colored in gray are the most reliable.

 
The precision obtained for the structure is sufficient to delineate two major regions in P39 that are potentially functionally important for transport:
  1. The ligand-binding site. By analogy with 1omp, this region is located in the deep cleft between the two domains. In 1omp it is heavily populated by polar and aromatic groups, most of them being involved in extensive hydrogen-bonding and van der Waals interactions with maltose (Spurlino et al., 1991). Hence the ligand-binding function of P39 could be attributed mainly to residues F59, D60, Q157 and W230.
  2. A region interacting with membrane transport components, composed in P39 of residues 211–224 corresponding to 1omp residues 207–220 (Spurlino et al., 1991).
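Residue correspondences such as P39 211–224 with 1omp 207–220 follow from reading the gapped target–template alignment column by column. The sketch below shows this bookkeeping, assuming both sequences are given as equal-length gapped strings; the sequences are invented and the function name is hypothetical.

```python
from typing import Dict


def map_target_to_template(target_aln: str, template_aln: str,
                           target_start: int = 1, template_start: int = 1) -> Dict[int, int]:
    """Map target residue numbers to template residue numbers for gap-free columns."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, int] = {}
    t_num, s_num = target_start - 1, template_start - 1
    for t_res, s_res in zip(target_aln, template_aln):
        if t_res != "-":
            t_num += 1  # advance the target numbering on every target residue
        if s_res != "-":
            s_num += 1  # advance the template numbering on every template residue
        if t_res != "-" and s_res != "-":
            mapping[t_num] = s_num
    return mapping


# Invented toy alignment: the target carries two extra N-terminal residues.
toy_mapping = map_target_to_template("MKTAYGLV", "--SAYGIV")
print(toy_mapping[5])  # 3
```

In the text the numbering offset between P39 and 1omp is 2 at signature D (127 versus 125) and 4 in the region 211–224 (versus 207–220), which would be consistent with a net two-residue insertion in P39 between those two regions of the alignment.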

There is a third functionally important region in 1omp, the hinge region, composed of the three inter-domain linkers, connecting the two domains and helping stabilize both domains precisely in the liganded closed form (Spurlino et al., 1991). However, as these segments vary from one substrate-binding protein structure to another, we could not delineate them precisely in P39.

Genetic context of P39 ORF

Previous results highlighted some structural features of the P39 protein, but not its function. To confirm the hypothesis that P39 is a periplasmic substrate-binding protein, we took advantage of the fact that the genes encoding the components of BPD transporters are almost invariably organized in operons to achieve a coordinated regulation of their expression (Boos and Lucht, 1996). Consequently, we undertook the DNA sequencing downstream of the P39 ORF (A.Tibor, unpublished data).

The analysis of these sequences reveals the presence of two ORFs that show high homology with sequences belonging to the integral inner membrane components of binding protein-dependent transporters and that contain the highly conserved sequence motif, located near the C-terminus of all proteins of this class and named the EAA loop (Dassa and Hofnung, 1985; Boos and Lucht, 1996). We also found a fourth ORF that shows homology with the ATP-binding-cassette (ABC) subunits of these transport systems.

These results are in agreement with the hypothesis that the P39 protein is the periplasmic substrate-binding component of a BPD transport system and suggest that ORFs 2 and 3 encode integral membrane proteins with permease properties and that ORF 4 encodes the ABC subunit (Figure 6).



Fig. 6. Operon organization in E.coli of the genes encoding components of BPD transporters and localization of their products in the transport system. A parallel is drawn with the B.abortus P39 (1) and ORF 2, 3 and 4 genes and products. Stars represent the substrate and arrows show its transport through the periplasm.

 
Conclusion

Our work constitutes the first attempt to solve structural features of the protein P39, at the limit of the so-called 'twilight zone'. To improve the accuracy of predictions, the proposed methodology is based on a combination of methods (sequence similarity searches, secondary structure prediction, fold recognition and alignments) and seeks a consensus at different steps of the modeling procedure.

The model suggests that P39 protein adopts a general periplasmic substrate-binding protein fold, with closer similarities to the maltose/maltodextrin-binding protein fold.

The genetic context suggests that the gene encoding P39 belongs to a binding protein-dependent transporter operon. The evidence is clear for the sequences (ORFs 2, 3 and 4) located downstream of the P39 ORF, which exhibit high homology with other integral inner membrane components or with the ABC subunit of BPD transporters.

Functional characterization is in progress in order to identify the substrate and to locate the signal peptide of P39, as no amino-terminal signal peptide has been predicted for P39.

Finally, the model provides a first step for designing site-directed mutants in two regions of P39: in the ligand-binding site (residues F59, D60, Q157 and W230) and in the region that should interact with the inner membrane component (residues 211–224). Functional tests still have to be developed to determine the effects of these mutations.

In conclusion, it appears that results from several prediction methods are consistent with each other and agree with the genetic context of the P39 ORF. The fact that the best template available for the modeling of P39 shares only low sequence similarity with it impeded the construction of a very accurate model of this protein. The model obtained, although rough, is accurate enough to provide plausible hypotheses on the overall fold of P39 and on its function, and to guide the design of site-directed mutations.


    Acknowledgments
 
The authors thank G.Baudoux, D.Devos, F.Godfroid, F.Melo and J.-Y.Paquet for their collaboration and many fruitful discussions. C.Lambert is supported by a grant from the Government of the Walloon Region of Belgium. A.Tibor is supported by the Commission of the European Communities, contract Eclair AGRE-CT90-OO49-C (EDB).


    Notes
 
1 To whom correspondence should be addressed: Katalin.deFays@FUNDP.ac.be


    References
 
Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) J. Mol. Biol., 215, 403–410.

Altschul,S.F., Madden,T.L., Schäffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402.

Aszodi,A., Munro,R.E.J. and Taylor,W.R. (1997) Proteins, Suppl.1, 38–42.

Bairoch,A. and Apweiler,R. (1997) Nucleic Acids Res., 26, 38–42.

Bajorath,J., Stenkamp,R. and Aruffo,A. (1993) Protein Sci., 2, 1798–1810.

Boos,W. and Lucht,J.M. (1996) In Neidhardt, F.C. (ed.), Escherichia coli and Salmonella. ASM Press, Washington, DC, pp. 1175–1209.

Corstjens,P. (1993), Thesis. Rijksuniversiteit te Leiden.

Dassa,E. and Hofnung,M. (1985) EMBO J., 4, 2287–2293.

Deleage,G. and Roux,B. (1987) Protein Engng, 1, 289–294.

Denoël,P., Vo,T., Tibor,A., Weynants,V.E., Trunde,J.-M., Dubray,G., Limet,J.N. and Letesson,J.-J. (1997) Infect. Immun., 65, 495–502.

Depiereux,E., Baudoux,G., Briffeuil,P., Reginster,I., De Bolle,X., Vinals,C. and Feytmans,E. (1997) Comput. Appl. Biosci., 13, 249–256.

Fischer,D. and Eisenberg,D. (1996) Protein Sci., 5, 947–955.

Frishman,D. and Argos,P. (1997) Proteins, 27, 329–335.

Geourjon,C. and Deleage,G. (1995) Comput. Appl. Biosci., 11, 681–684.

Gibrat,J.F., Garnier,J. and Robson,B. (1987) J. Mol. Biol., 198, 425–443.

Guex,N. and Peitsch,M.C. (1997) Electrophoresis, 18, 2714–2723.

Henikoff,S. and Henikoff,J.G. (1994) Genomics, 19, 97–107.

Holm,L. and Sander,C. (1998) Nucleic Acids Res., 26, 316–319.

Jones,D.T., Taylor,W.R. and Thornton,J.M. (1992) Nature, 358, 86–89.

King,R.D. and Sternberg,M.J. (1996) Protein Sci., 5, 2298–2310.

Kneller,D.G., Cohen,F.E. and Langridge,R. (1990) J. Mol. Biol., 214, 171–182.

Kraulis,P.J. (1991) J. Appl. Crystallogr., 24, 946–950.

Laskowski,R.A., Moss,D.S. and Thornton,J.M. (1993) J. Mol. Biol., 231, 1049–1067.

Levin,J.M., Robson,B. and Garnier,J. (1986) FEBS Lett., 205, 303–308.

Liebl,W., Stemplinger,I. and Ruile,P. (1997) J. Bacteriol., 179, 941–948.

McClelland,J.L. and Rumelhart,D.E. (1988) Explanations in Parallel Distributed Processing. http://www.impharm.ucsf.edu/rnomi/mmpredict.html. MIT Press, Cambridge, MA, pp. 318–362.

Molecular Simulations (1996) Cerius2 User Guide. Molecular Simulations, San Diego.

Murzin,A.G., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) J. Mol. Biol., 247, 536–540.

Myers,E. and Miller,W. (1988) CABIOS, 4, 11–17.

Nielsen,H., Engelbrecht,J., Brunak,S. and von Heijne,G. (1997) Protein Engng, 10, 1–6.

Overduin,P., Boos,W. and Tommassen,J. (1988) Mol. Microbiol., 2, 767–775.

Pearson,W.R. (1990) Methods Enzymol., 183, 63–98.

Pearson,W.R. and Lipman,D.J. (1988) Proc. Natl Acad. Sci. USA, 85, 2444–2448.

Rice,D.W., Fischer,D., Weiss,R. and Eisenberg,D. (1997) Proteins, Suppl.1, 113–122.

Rost,B. (1995) In Rawling,C. (ed.), The Third International Conference on Intelligent Systems for Molecular Biology (ISMB). AAAI Press, Cambridge, pp. 314–321.

Rost,B. (1996) Methods Enzymol., 266, 525–539.

Rost,B. and Sander,C. (1993) J. Mol. Biol., 232, 584–599.

Rost,B. and Sander,C. (1994) Proteins, 19, 55–72.

Rost,B., Sander,C. and Schneider,R. (1994) CABIOS, 10, 53–60.

Rost,B., Casadio,R., Fariselli,P. and Sander,C. (1995) Protein Sci., 4, 521–533.

Rost,B., Fariselli,P. and Casadio,R. (1996) Protein Sci., 7, 1704–1718.

Russell,R.R., Aduse-Opoku,J., Sutcliffe,I.C., Tao,L. and Ferretti,J.J. (1992) J. Biol. Chem., 267, 4631–4637.

Sali,A. and Blundell,T.L. (1993) J. Mol. Biol., 234, 779–815.

Sharff,A.J., Rodseth,L.E., Spurlino,J.C. and Quiocho,F.A. (1992) Biochemistry, 31, 10657–10663.

Sharff,A.J., Rodseth,L.E. and Quiocho,F.A. (1993) Biochemistry, 32, 10553–10559.

Sippl,M.J. (1993) Proteins, 17, 355–362.

Sippl,M.J. and Weitckus,S. (1992) Proteins, 13, 258–271.

Solovyev,V.V. and Salamov,A.A. (1994) CABIOS, 10, 661–669.

Spurlino,J.C., Lu,G.-Y. and Quiocho,F.A. (1991) J. Biol. Chem., 266, 5202–5219.

Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Nucleic Acids Res., 22, 4673–4680.

Tramontano,A. (1998) METHODS: Companion Methods Enzymol., 14, 293–300.

van Wezel,G.P., White,J., Young,P., Postma,P.W. and Bibb,M.J. (1997) Mol. Microbiol., 23, 537–549.

Vinals,C., De Bolle,X., Depiereux,E. and Feytmans,E. (1995) Proteins, 21, 307–318.

Wouters,J. and Baudoux,G. (1998) Proteins, 32, 97–110.

Received July 20, 1998; revised November 26, 1998; accepted December 11, 1998.