Article

A Fast and Interpretable Deep Learning Approach for Accurate Electrostatics-Driven pKa Predictions in Proteins

Journal

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jctc.2c00308

Funding

  1. FCT [SFRH/BD/136226/2018, CEECIND/02300/2017, UIDB/04046/2020, UIDP/04046/2020]
  2. European Union [101017567]
  3. Bayer AG Life Science Collaboration (Explainable AI)

In this study, deep learning models trained on a data set of 6 million theoretically determined pKa shifts inferred the electrostatic contributions of different chemical groups and the importance of solvent exposure. The models achieved the best accuracy on an experimental test set and ran more than 1000x faster than physics-based methods at inference.
Existing computational methods for estimating pKa values in proteins rely on theoretical approximations and lengthy computations. In this work, we use a data set of 6 million theoretically determined pKa shifts to train deep learning models, which are shown to rival the physics-based predictors. These neural networks inferred the electrostatic contributions of different chemical groups and learned the importance of solvent exposure and close interactions, including hydrogen bonds. Although trained only on theoretical data, our pKAI+ model displayed the best accuracy on a test set of ~750 experimental values. Inference times allow speedups of more than 1000x compared to physics-based methods. By combining speed, accuracy, and a reasonable understanding of the underlying physics, our models provide a game-changing solution for fast estimation of macroscopic pKa values from ensembles of microscopic values, as well as for many downstream applications such as molecular docking and constant-pH molecular dynamics simulations.
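
To make the idea of a macroscopic pKa derived from an ensemble of microscopic values concrete, the sketch below shows one simple way it could be computed. It is an illustration under stated assumptions, not the authors' procedure: each conformation is assumed to follow an independent Henderson-Hasselbalch curve, all conformations are weighted equally, and the macroscopic pKa is taken as the pH at which the ensemble-averaged protonation reaches 0.5. The function names and the example pKa values are hypothetical.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch protonated fraction of a single site at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def mean_protonated_fraction(micro_pkas, ph: float) -> float:
    """Average protonation over an ensemble (assumption: equal conformer weights)."""
    return sum(protonated_fraction(p, ph) for p in micro_pkas) / len(micro_pkas)


def macroscopic_pka(micro_pkas, lo: float = -5.0, hi: float = 20.0,
                    tol: float = 1e-6) -> float:
    """pH at which the ensemble-averaged protonation equals 0.5.

    The averaged curve decreases monotonically with pH, so bisection on the
    bracket [lo, hi] converges to the midpoint of the titration.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_protonated_fraction(micro_pkas, mid) > 0.5:
            lo = mid  # still mostly protonated: the midpoint lies at higher pH
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical per-conformation pKa values for one residue (illustrative only).
    ensemble = [4.0, 6.0]
    print(f"macroscopic pKa estimate: {macroscopic_pka(ensemble):.2f}")
```

Because the per-conformation prediction is the expensive step in physics-based methods, a fast predictor makes this kind of ensemble averaging practical over many snapshots.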
