Biomolecular simulation based machine learning models accurately predict sites of tolerability to the unnatural amino acid acridonylalanine

SCIENTIFIC REPORTS (2021)

Journal

SCIENTIFIC REPORTS

Volume 11, Issue 1

Publisher

NATURE PORTFOLIO

DOI: 10.1038/s41598-021-97965-2

Funding

University of Pennsylvania
National Science Foundation (NSF) [CHE-1708759]
NSF through the NSF Graduate Research Fellowship Program [DGE-1321851, DGE-1845298]

Automated Summary

The newly developed scoring functions accurately predict the impact of incorporating the unnatural amino acid acridon-2-yl-alanine (Acd) on protein yield and solubility. Feature importance analyses show that terms for hydrophobic interactions, desolvation, and amino acid angle preferences play a pivotal role in predicting mutation tolerance. The study provides evidence that machine learning applied to features extracted from simulated structural models can accurately predict diverse and abstract biological phenomena.

Abstract

The incorporation of unnatural amino acids (Uaas) has provided an avenue for novel chemistries to be explored in biological systems. However, the successful application of Uaas is often hampered by site-specific impacts on protein yield and solubility. Although previous efforts to identify features that accurately capture these site-specific effects have been unsuccessful, we have developed a set of novel Rosetta Custom Score Functions and alternative Empirical Score Functions that accurately predict the effects of acridon-2-yl-alanine (Acd) incorporation on protein yield and solubility. Acd-containing mutants were simulated in PyRosetta, and machine learning (ML) was performed using either the decomposed values of the Rosetta energy function, or changes in residue contacts and bioinformatics. Using these feature sets, which represent Rosetta score function-specific and bioinformatics-derived terms, ML models were trained to predict highly abstract experimental parameters such as mutant protein yield and solubility and displayed robust performance on well-balanced holdouts. Model feature importance analyses demonstrated that terms corresponding to hydrophobic interactions, desolvation, and amino acid angle preferences played a pivotal role in predicting tolerance of mutation to Acd. Overall, this work provides evidence that the application of ML to features extracted from simulated structural models allows for the accurate prediction of diverse and abstract biological phenomena, beyond the predictivity of traditional modeling and simulation approaches.
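The abstract does not say how the feature importance analyses were carried out. As a rough illustration of the general idea, the sketch below uses permutation importance, a generic technique chosen here for illustration and not necessarily the authors' method: each feature column is shuffled in turn, and the resulting drop in accuracy is taken as that feature's importance. The data, the feature names `informative_term` and `noise_term`, and the rule-based `toy_predict` model are all invented for this example; they are not Rosetta score terms or results from the paper.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column while leaving every other column untouched.
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic toy data (assumption): two made-up features per mutant site.
# Label 1 = "tolerated", label 0 = "not tolerated"; the values are invented.
FEATURE_NAMES = ["informative_term", "noise_term"]  # hypothetical names
rows = [[-2.0, 0.3], [-1.5, 0.9], [-1.0, 0.1], [-0.5, 0.7],
        [0.5, 0.2], [1.0, 0.8], [1.5, 0.4], [2.0, 0.6]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def toy_predict(row):
    """Stand-in for a trained model: it looks only at the first feature."""
    return 1 if row[0] < 0 else 0


importances = permutation_importance(toy_predict, rows, labels)
for name, value in zip(FEATURE_NAMES, importances):
    print(f"{name}: {value:.3f}")
```

Because the stand-in model ignores the second feature, shuffling that column cannot change any prediction, so its importance is exactly zero, while shuffling the first column degrades accuracy. In a realistic workflow the model would be fitted on training data and the importances computed on a held-out set.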
