Article

Three Simple Properties Explain Protein Stability Change upon Mutation

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 61, Issue 4, Pages 1981-1988

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jcim.1c00201

Funding

  1. Danish Council for Independent Research [8022-00041B]

This study uses SimBa, a simple multilinear regression model, to predict protein stability changes from only three properties: the solvent accessibility of the mutated site, and the volume difference and polarity difference caused by the mutation. This straightforward model performs comparably to more complex methods, which suggests a hard limit of about 1 kcal/mol in numerical accuracy and R ~ 0.5 in trend accuracy. New features are required to improve accuracy beyond this limit.
Accurate prediction of protein stability upon mutation enables rational engineering of new proteins and insights into protein evolution and monogenetic diseases caused by single-point amino acid substitutions. Many tools have been developed for this purpose, ranging from energy-based models to machine-learning methods that use large amounts of experimental data. However, as the methods become more complex, the chemistry underlying the protein stability effects becomes harder to interpret. It is thus of interest to identify the simplest prediction model that retains complete amino-acid-specific interpretation; for a given number of input descriptors, we expect such a model to be almost universal. In this study, we identify such a limiting model, SimBa, a simple multilinear regression model trained on a substitution-type-balanced experimental data set. The model accounts only for the solvent accessibility of the site, the volume difference, and the polarity difference caused by mutation. Our results show that this very simple and directly applicable model performs comparably to other much more complex, widely used protein stability prediction methods. This suggests that a hard limit of ~1 kcal/mol numerical accuracy and R ~ 0.5 trend accuracy exists and that new features, such as an account of unfolded states, water colocalization, and amino acid correlations, are required to improve accuracy to, e.g., 1/2 kcal/mol.
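
To make the idea of a three-descriptor multilinear regression concrete, the following is a minimal sketch of how such a model could be fitted and scored. It is not the SimBa implementation: the function names, descriptor scales, coefficients, and data are all hypothetical, and the data are randomly generated purely for illustration. Numerical accuracy is taken here as the mean absolute error and trend accuracy as the Pearson correlation R, which is an assumption about how the two quantities in the abstract are measured.

```python
import numpy as np


def fit_linear_model(features, ddg):
    """Least-squares fit of ddG = b0 + b1*RSA + b2*dVolume + b3*dPolarity."""
    design = np.column_stack([np.ones(len(features)), features])
    coeffs, *_ = np.linalg.lstsq(design, ddg, rcond=None)
    return coeffs


def predict(coeffs, features):
    """Apply the fitted intercept and slopes to a feature matrix."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coeffs


def evaluate(predicted, observed):
    """Return (mean absolute error, Pearson correlation R)."""
    mae = float(np.mean(np.abs(predicted - observed)))
    r = float(np.corrcoef(predicted, observed)[0, 1])
    return mae, r


# Synthetic toy data (NOT experimental): one row per hypothetical mutation.
rng = np.random.default_rng(0)
n = 200
features = np.column_stack([
    rng.uniform(0.0, 1.0, n),   # relative solvent accessibility of the site
    rng.normal(0.0, 40.0, n),   # volume difference (arbitrary scale)
    rng.normal(0.0, 3.0, n),    # polarity difference (arbitrary scale)
])

# Arbitrary illustrative coefficients, not values from the paper.
true_coeffs = np.array([-1.5, 1.0, 0.01, 0.1])
ddg = predict(true_coeffs, features) + rng.normal(0.0, 1.0, n)

coeffs = fit_linear_model(features, ddg)
mae, r = evaluate(predict(coeffs, features), ddg)
print("fitted coefficients:", np.round(coeffs, 3))
print(f"MAE = {mae:.2f} (toy units), Pearson R = {r:.2f}")
```

Because the model is linear with an explicit intercept, each fitted coefficient can be read directly as the contribution of one property per unit change, which is the kind of interpretability the abstract emphasizes.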
