Article

A base measure of precision for protein stability predictors: structural sensitivity

Journal

BMC BIOINFORMATICS
Volume 22, Issue 1

Publisher

BMC
DOI: 10.1186/s12859-021-04030-w

Keywords

Protein stability; Mutation; Computation; Protein structure; Structural sensitivity

Funding

  1. Danish Council for Independent Research [8022-00041B]

The study analyzed the structural sensitivity of stability prediction methods for proteins and found that the methods can be grouped into two categories with varying sensitivity. Accuracy correlates with precision for mutation-type-balanced data sets, highlighting the importance of balance in mutation types. Machine-learning methods may underestimate the significance of protein structure compared to side-chain-sensitive methods.
Background

Predicting the change in fold stability (ΔΔG) of a protein upon mutation is of major importance to protein engineering and to screening for disease-causing variants. Many prediction methods can use 3D structural information to predict ΔΔG. While the performance of these methods has been studied extensively, the abundance of crystal structures has raised a new problem: how precise are these methods with respect to the structure used as input, which structure should be used, and how much does it matter? There is thus a need to quantify the structural sensitivity of protein stability prediction methods.

Results

We computed the structural sensitivity of six widely used prediction methods using saturated computational mutagenesis on a diverse set of 87 structures of 25 proteins. Our results show that structural sensitivity varies massively and, surprisingly, falls into two very distinct groups: methods that take detailed account of the local environment show a sensitivity of approximately 0.6 to 0.8 kcal/mol, whereas machine-learning methods display much lower sensitivity (approximately 0.1 kcal/mol). We also observe that the precision correlates with the accuracy for mutation-type-balanced data sets but not with the generally reported accuracy of the methods, indicating the importance of mutation-type balance in both contexts.

Conclusions

The structural sensitivity of stability prediction methods varies greatly and is caused mainly by the models and less by the actual protein structural differences. As a new recommended standard, we therefore suggest that ΔΔG values be evaluated on three protein structures when available and that the associated standard deviation be reported, to emphasize not just the accuracy but also the precision of the method in a specific study. Our observation that machine-learning methods deemphasize structure may indicate that folded wild-type structures alone, without the folded mutant and unfolded structures, add only modest value for assessing protein stability effects, and that side-chain-sensitive methods overstate the significance of the folded wild-type structure.
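The recommendation to evaluate each mutation on several structures and report the standard deviation can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formula, so this sketch assumes that structural sensitivity is the average, over mutations, of the sample standard deviation of the predicted values across structures of the same protein. The function names, mutation labels, and structure identifiers are hypothetical, and the numbers in the usage note are made up for illustration.

```python
from statistics import mean, stdev


def ddg_mean_and_sd(ddg_by_structure):
    """Return (mean, sample SD) of one mutation's predicted stability
    change (kcal/mol) across several structures of the same protein.

    ddg_by_structure maps a structure identifier to a predicted value.
    At least two structures are required for a sample SD.
    """
    values = list(ddg_by_structure.values())
    if len(values) < 2:
        raise ValueError("need predictions from at least two structures")
    return mean(values), stdev(values)


def structural_sensitivity(predictions):
    """Average per-mutation SD across structures (an assumed definition,
    not necessarily the one used in the paper).

    predictions maps a mutation label to {structure identifier: value}.
    """
    per_mutation_sd = [
        ddg_mean_and_sd(by_structure)[1]
        for by_structure in predictions.values()
    ]
    return mean(per_mutation_sd)
```

For example, calling `ddg_mean_and_sd({"struct_a": 1.0, "struct_b": 2.0, "struct_c": 3.0})` on three hypothetical structures yields a mean and a standard deviation that can be reported together, which is the kind of precision estimate the conclusions call for.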
