4.7 Article

Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 6

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab136

Keywords

Forman Ricci curvature; molecular featurization; machine learning; drug design

Funding

  1. Nanyang Technological University [M4081842.110]
  2. Singapore Ministry of Education Academic Research fund (Tier 1) [RG109/19]
  3. Singapore Ministry of Education Academic Research fund (Tier 2) [MOE2018-T2-1-033]

Artificial intelligence techniques have been applied across the entire drug design process, and molecular featurization remains a central challenge for AI-based drug design.
Artificial intelligence (AI) techniques have been gradually applied across the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug design is molecular featurization, that is, identifying or designing appropriate molecular descriptors or fingerprints. Efficient and transferable molecular descriptors are key to the success of all AI-based drug design models. Here we propose, for the first time, Forman persistent Ricci curvature (FPRC)-based molecular featurization and feature engineering. Molecular structures and interactions are modeled as simplicial complexes, which generalize graphs to higher dimensions. A multiscale representation is then achieved through a filtration process, which generates a series of nested simplicial complexes at different scales. Forman Ricci curvatures (FRCs) are calculated on this series of simplicial complexes, and the persistence and variation of FRCs during the filtration process are defined as FPRC. Persistent attributes, which are FPRC-based functions and properties, are employed as molecular descriptors and combined with machine learning models, in particular gradient boosting tree (GBT). Our FPRC-GBT models are extensively trained and tested on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. Our results are better than those from machine learning models with traditional molecular descriptors.
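
The filtration-plus-curvature idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a distance-based filtration on point coordinates, restricts each complex to its 1-skeleton (a graph), and uses the unweighted graph form of Forman Ricci curvature for an edge, F(u, v) = 4 - deg(u) - deg(v), which ignores higher-dimensional simplices. The function names (`rips_edges`, `forman_edge_curvatures`, `fprc_profile`) and the choice of min, max and mean as summary attributes are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist


def rips_edges(points, radius):
    """Edges (i, j) whose endpoints lie within `radius` of each other."""
    return [
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if dist(points[i], points[j]) <= radius
    ]


def forman_edge_curvatures(n_vertices, edges):
    """Unweighted graph Forman curvature: F(u, v) = 4 - deg(u) - deg(v).

    Assumption: only vertices and edges are considered; triangles and
    higher simplices are ignored in this simplified form.
    """
    degree = [0] * n_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {(u, v): 4 - degree[u] - degree[v] for u, v in edges}


def fprc_profile(points, radii):
    """For each filtration value, record (radius, min, max, mean) of edge curvatures."""
    profile = []
    for r in radii:
        edges = rips_edges(points, r)
        curv = list(forman_edge_curvatures(len(points), edges).values())
        if curv:
            profile.append((r, min(curv), max(curv), sum(curv) / len(curv)))
        else:
            # No edges at this scale: use zeros as a placeholder summary.
            profile.append((r, 0, 0, 0.0))
    return profile


# Toy example: the four corners of a unit square (not real molecular data).
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
profile = fprc_profile(points, [0.5, 1.0, 1.5])
for row in profile:
    print(row)
```

In this toy case the curvature summary changes as the filtration radius grows and more edges appear, which is the kind of variation across scales that the abstract refers to. A flattened profile of this sort could in principle serve as a feature vector for a gradient boosting regressor, though the descriptors used in the paper are richer than this sketch.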
