1 Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany and 2 CRIBI Biotechnology Center, University of Padova, Viale G. Colombo 3, 35121 Padova, Italy
M.Albrecht and S.C.E.Tosatto contributed equally to this work.
3 To whom correspondence should be addressed. e-mail: mario.albrecht{at}mpi-sb.mpg.de; silvio{at}cribi.unipd.it
Abstract
Keywords: consensus formation/protein structure prediction/secondary structure
Introduction
We found that a set of only three state-of-the-art methods combined using majority voting is sufficient to achieve similar improvements in the prediction accuracy. This simple approach runs at low computational cost, but uses the currently best prediction servers.
In order to test our approach, we participated in the critical assessment of structure prediction, the CASP5 experiment of the year 2002 (Tramontano, 2003). We combined the prediction results of the three servers PSIPRED, SAM-T02 and SSpro2, which are based on different prediction approaches using neural networks and hidden Markov models. The three servers have shown top performance in former CASP experiments and in the continuous automatic evaluation (EVA) of protein structure prediction servers (Eyrich et al., 2001; Rost and Eyrich, 2001) and have higher overall accuracy than older combination methods such as Jpred.
Astonishingly, our method significantly outperformed almost all other methods participating in CASP5 and reached the second rank, below a manual expert submission, according to the SOV score (Rost et al., 1994; Zemla et al., 1999), normalized with respect to the total number of all 78 target protein domain sequences (for details see http://www.russell.embl.de/casp5/). In particular, our method ranked first regarding the SOV accuracy measure for a subset of 21 target domains unrelated by sequence and with low sequence similarity to known protein structures. Regarding the alternative Q3 score (Rost and Eyrich, 2001), our combination method ranked first for the set of all targets and for the subset of sequence-unrelated targets.
Encouraged by the CASP5 results, we decided to investigate our approach on larger benchmark sets obtained from EVA. In particular, we show that our approach always improves the prediction accuracy over the best single method of the three methods combined to form the consensus. Comparing the frequencies of the occurrence of certain majority situations, we are able to draw interesting conclusions on the degree of similarity between results of single prediction methods and on the increased confidence in consistently predicted secondary structure.
Materials and methods
In our evaluation, we used the three benchmark sets common2, common5 and common6 from 22 September 2002 with sequences of low identity, as provided by the EVA web site (http://cubic.bioc.columbia.edu/~eva/). The set common2 contains 121 sequences with 16 858 amino acids, common5 contains 214 sequences with 44 871 amino acids and common6 contains 539 sequences with 98 308 residues. Because not all methods have returned predictions for every sequence requested by EVA, not every benchmark set could be combined with the same three methods used for consensus computation (see footnote of Table I).
Consensus formation
For each benchmark set, three single methods of top performance in EVA are selected in order to compute the consensus secondary structure sequence by majority voting. Specifically, we used the results of the following seven prediction methods: PSIPRED (Jones, 1999; McGuffin et al., 2000), SAM-T99 (Karplus et al., 1998), SSpro1 (Baldi et al., 1999), SSpro2 (Pollastri et al., 2002), PHDpsi (Przybylski and Rost, 2002), PROFsec (Rost and Eyrich, 2001) and Jpred (Cuff and Barton, 2000).
Three cases need to be distinguished when forming the consensus sequence per amino acid according to the three possible secondary structure states α-helix (H), β-strand (E) and other/loop (L). A 3:0 vote means consistent prediction among all three methods. A 2:1 vote results in the majority decision. The rare case of a 1:1:1 tie is resolved by assuming the L state. Each consensus sequence is annotated with a confidence array, which contains values ranging from 1 to 3 according to the maximum number of identical votes per residue.
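The voting rule described above can be illustrated with a minimal Python sketch. The function name `consensus` and the representation of predictions as equal-length strings over H, E and L are assumptions made for this illustration; they are not taken from the original implementation.

```python
from collections import Counter


def consensus(p1, p2, p3):
    """Majority vote per residue over three equal-length H/E/L strings.

    Returns the consensus string and a confidence list holding, for each
    residue, the maximum number of identical votes (1, 2 or 3).
    Hypothetical helper, written only to illustrate the rule in the text.
    """
    if not (len(p1) == len(p2) == len(p3)):
        raise ValueError("predictions must have equal length")
    states, confidence = [], []
    for votes in zip(p1, p2, p3):
        state, count = Counter(votes).most_common(1)[0]
        if count == 1:
            # 1:1:1 tie: all three methods disagree, so assume loop (L)
            state = "L"
        states.append(state)
        confidence.append(count)
    return "".join(states), confidence
```

For example, `consensus("HHEL", "HHEE", "HELH")` yields the consensus `"HHEL"` with the confidence array `[3, 2, 2, 1]`: one unanimous residue, two majority decisions and one tie resolved to L.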
Prediction accuracy
To determine the prediction accuracy, we compared the predicted consensus sequence with the true three-state sequence derived from the DSSP secondary structure assignment of known 3D structures (Kabsch and Sander, 1983). Each of the three possible states H, E and L results from the collapse transformation of the eight DSSP states per residue according to the following schema (Rost and Eyrich, 2001): {G, H, I} → α-helix (H), {B, E} → β-strand (E), {S, T, .} → other (L).
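The collapse transformation can be written as a simple lookup, sketched below. The names `DSSP_TO_THREE_STATE` and `reduce_dssp` are hypothetical, and mapping every symbol outside {G, H, I, B, E} to L (including blanks or other placeholder characters used in DSSP-derived files) is an assumption that goes slightly beyond the schema stated above.

```python
# Helix-like and strand-like DSSP states; everything else becomes loop (L).
DSSP_TO_THREE_STATE = {
    "G": "H", "H": "H", "I": "H",  # {G, H, I} -> helix (H)
    "B": "E", "E": "E",            # {B, E}    -> strand (E)
}


def reduce_dssp(dssp):
    """Collapse a string of DSSP states to the three states H, E and L.

    Assumption: any symbol not listed in the table (S, T, '.', blanks)
    is treated as other/loop (L).
    """
    return "".join(DSSP_TO_THREE_STATE.get(state, "L") for state in dssp)
```

With this sketch, `reduce_dssp("GHIBEST.")` gives `"HHHEELLL"`.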
For each benchmark set, we computed average Q3 and SOV percentage values (Rost et al., 1994; Zemla et al., 1999; Rost and Eyrich, 2001) as well as the separate percentages QH, QE and QL of residues predicted correctly in the observed H, E and L states, respectively. For each accuracy measure, we calculated the standard error by dividing the standard deviation of the measure by the square root of the benchmark set size. Based on the assumption of a Gaussian distribution of the accuracy measures with similar standard deviations as known from observations, the accuracy difference between two distinct prediction methods is said to be statistically significant if it is larger than the standard error (Rost and Eyrich, 2001).
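The per-residue measures and the significance criterion can be sketched as follows. All function names are hypothetical. Two points are assumptions, because the text does not specify them: the sample standard deviation (n − 1 denominator) is used, and the accuracy values passed to `standard_error` are taken to be one value per benchmark sequence. The segment-based SOV score is not reproduced here.

```python
import math
import statistics


def q3(predicted, observed):
    """Percentage of residues whose predicted state equals the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def q_state(predicted, observed, state):
    """Percentage of residues observed in `state` that are predicted correctly.

    Returns None if `state` does not occur in the observed sequence.
    """
    positions = [i for i, o in enumerate(observed) if o == state]
    if not positions:
        return None
    correct = sum(predicted[i] == state for i in positions)
    return 100.0 * correct / len(positions)


def standard_error(values):
    """Standard deviation divided by the square root of the set size.

    Assumption: sample standard deviation; needs at least two values.
    """
    return statistics.stdev(values) / math.sqrt(len(values))


def is_significant(mean_a, mean_b, std_err):
    """Criterion from the text: the difference exceeds the standard error."""
    return abs(mean_a - mean_b) > std_err
```

For instance, `q3("HHEL", "HHLL")` is 75.0 and `q_state("HHEL", "HHLL", "L")` is 50.0, since only one of the two residues observed as L is predicted as L.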
Results and discussion
The results of the consensus formation by majority voting using three different benchmark sets are summarized in Table I. The comparison of our consensus approach with the respective best single method demonstrates that the total average Q3 accuracy is increased significantly (in the sense described above) by 1.45, 1.50 and 0.41 percentage points for the three sets common2, common5 and common6, respectively. In addition, the SOV measure is improved by 0.68 percentage points for the common5 set, while it does not change considerably for the other two sets. Table I also contains the results of the consensus prediction method Jpred as available for the sets common2 and common5, but its accuracy is generally clearly below that of the other methods. For comparison, we included the results of PROFsec, another top-performing single prediction method, in Table I for the common2 set. Its Q3 prediction accuracy is significantly lower, by 1.27–1.57 percentage points, than the very similar consensus results of any three single methods combined out of the four available methods, PSIPRED, SAM-T99, SSpro and PROFsec.
Filtering of prediction results
We found that a trivial filtering procedure that eliminates α-helices and β-strands that are too short generally neither degrades nor improves the prediction accuracy significantly, whether it is applied before and/or after the consensus formation (supplementary data are available at Protein Engineering online). This kind of structural filtering can therefore be employed without disadvantage in order to clean up the secondary structure predictions before further processing.
Frequency of majority situations
The additional analysis of the overall frequency of the three possible types of majority situations (3:0, 2:1 and the 1:1:1 tie) shows that the problematic tie case, in which each of the three single methods predicts a different secondary structure, occurs in at most 1% of all cases (Table II). Thus, the tie case can be neglected when applying our consensus approach.
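Such frequencies can be tallied directly from the three aligned predictions. The sketch below is illustrative only; the function name `vote_pattern_frequencies` and the string representation of the predictions are assumptions.

```python
from collections import Counter

# Label for each possible maximum number of identical votes.
PATTERN_BY_TOP_VOTES = {3: "3:0", 2: "2:1", 1: "1:1:1"}


def vote_pattern_frequencies(p1, p2, p3):
    """Percentage of residues in each majority situation.

    Hypothetical helper: expects three non-empty, equal-length H/E/L strings.
    """
    if not (len(p1) == len(p2) == len(p3)) or not p1:
        raise ValueError("predictions must be non-empty and of equal length")
    counts = Counter()
    for votes in zip(p1, p2, p3):
        top = max(Counter(votes).values())
        counts[PATTERN_BY_TOP_VOTES[top]] += 1
    return {
        pattern: 100.0 * counts[pattern] / len(p1)
        for pattern in ("3:0", "2:1", "1:1:1")
    }
```

Applied to the toy input `("HHEL", "HHEE", "HELH")`, the sketch reports 25% unanimous residues, 50% majority decisions and 25% ties; these numbers only illustrate the tally and say nothing about the real benchmark sets.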
Prediction confidence
We also verified the intuitive expectation that the confidence in the correctness of the prediction is increased by consensus formation. We found that the Q3 and SOV values computed solely for secondary structure states that are consistently predicted by all three methods are much higher than overall values with an increase of 6.32–6.62 and 5.06–5.46 percentage points for Q3 and SOV, respectively (Table III). Similar results are obtained after separating the Q3 value into the three secondary structure classes: QH and QE are increased by 4.12–6.03 and 2.88–4.31 percentage points, respectively, while QL is increased on average by 5.83–6.20 percentage points.
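Restricting Q3 to the consistently predicted residues amounts to keeping only the positions with a 3:0 vote. A minimal sketch under that reading is given below; the function name `q3_unanimous` is hypothetical, and the corresponding restriction of the segment-based SOV score is not shown.

```python
def q3_unanimous(p1, p2, p3, observed):
    """Q3 computed only over residues on which all three methods agree.

    Returns None if no residue is predicted unanimously.
    Hypothetical helper for illustration.
    """
    if not (len(p1) == len(p2) == len(p3) == len(observed)):
        raise ValueError("all sequences must have equal length")
    agreed = [
        (a, o) for a, b, c, o in zip(p1, p2, p3, observed) if a == b == c
    ]
    if not agreed:
        return None
    correct = sum(a == o for a, o in agreed)
    return 100.0 * correct / len(agreed)
```

Comparing this value with the Q3 of the full consensus sequence gives the kind of difference reported in the paragraph above.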
In summary, we found that a simple consensus approach based on the majority voting of only three prediction methods can be superior to each of the three methods as well as to complex combinations of more than three single prediction methods as employed in Jpred. Our method was shown to work with distinct combinations of different prediction methods on large benchmark sets. Presumably, the success of the method is mainly due to the use of three of the currently best single methods and the noise-filtering properties of a consensus approach, which helps to ignore the training errors of single methods. We believe that any three state-of-the-art prediction methods can be used for the consensus. The method is computationally less expensive than other consensus approaches and has the advantage of not requiring the calibration of any parameters.
Acknowledgements
References
Baldi,P., Brunak,S., Frasconi,P., Soda,G. and Pollastri,G. (1999) Bioinformatics, 15, 937–946.
Chandonia,J.M. and Karplus,M. (1999) Proteins, 35, 293–306.
Cuff,J.A. and Barton,G.J. (1999) Proteins, 34, 508–519.
Cuff,J.A. and Barton,G.J. (2000) Proteins, 40, 502–511.
Cuff,J.A., Clamp,M.E., Siddiqui,A.S., Finlay,M. and Barton,G.J. (1998) Bioinformatics, 14, 892–893.
Eyrich,V.A., Marti-Renom,M.A., Przybylski,D., Madhusudhan,M.S., Fiser,A., Pazos,F., Valencia,A., Sali,A. and Rost,B. (2001) Bioinformatics, 17, 1242–1243.
Guermeur,Y., Geourjon,C., Gallinari,P. and Deléage,G. (1999) Bioinformatics, 15, 413–421.
Jones,D.T. (1999) J. Mol. Biol., 292, 195–202.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
Karplus,K., Barrett,C. and Hughey,R. (1998) Bioinformatics, 14, 846–856.
King,R.D., Ouali,M., Strong,A.T., Aly,A., Elmaghraby,A., Kantardzic,M. and Page,D. (2000) Protein Eng., 13, 15–19.
McGuffin,L.J., Bryson,K. and Jones,D.T. (2000) Bioinformatics, 16, 404–405.
Ouali,M. and King,R.D. (2000) Protein Sci., 9, 1162–1176.
Petersen,T.N., Lundegaard,C., Nielsen,M., Bohr,H., Bohr,J., Brunak,S., Gippert,G.P. and Lund,O. (2000) Proteins, 41, 17–20.
Pollastri,G., Przybylski,D., Rost,B. and Baldi,P. (2002) Proteins, 47, 228–235.
Przybylski,D. and Rost,B. (2002) Proteins, 46, 197–205.
Rost,B. (2001) J. Struct. Biol., 134, 204–218.
Rost,B. and Eyrich,V.A. (2001) Proteins, 45, Suppl. 5, 192–199.
Rost,B., Sander,C. and Schneider,R. (1994) J. Mol. Biol., 235, 13–26.
Selbig,J., Mevissen,T. and Lengauer,T. (1999) Bioinformatics, 15, 1039–1046.
Tramontano,A. (2003) Nature Struct. Biol., 10, 87–90.
Zemla,A., Venclovas,C., Fidelis,K. and Rost,B. (1999) Proteins, 34, 220–223.
Received March 7, 2003; revised May 24, 2003; accepted June 6, 2003.