4.7 Review

Data considerations for predictive modeling applied to the discovery of bioactive natural products

Related references

Note: Only some of the references are listed.
Review Pharmacology & Pharmacy

How doppelganger effects in biomedical data confound machine learning

Li Rong Wang et al.

Summary: Machine learning models are widely used in drug development. However, the presence of data doppelgangers can affect the reliability of evaluation methods. This study demonstrates the prevalence of data doppelgangers in biomedical data and provides recommendations to mitigate the doppelganger effect.

DRUG DISCOVERY TODAY (2022)

Article Biochemistry & Molecular Biology

AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models

Mihaly Varadi et al.

Summary: AlphaFold DB is an openly accessible database with high-accuracy protein-structure predictions, powered by DeepMind's AlphaFold v2.0. It provides programmatic access to a vast number of predicted structures and is expanding to cover more sequences.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemical Research Methods

MolTrans: Molecular Interaction Transformer for drug-target interaction prediction

Kexin Huang et al.

Summary: The MolTrans model improves the accuracy and interpretability of drug-target interaction prediction through a knowledge-inspired sub-structural pattern mining algorithm and an augmented transformer encoder, which better extract and capture the semantic relations among sub-structures derived from massive unlabeled biomedical data.

BIOINFORMATICS (2021)

Article Chemistry, Medicinal

Deep learning enables discovery of highly potent anti-osteoporosis natural products

Zhihong Liu et al.

Summary: The pre-trained self-attentive message passing neural network (P-SAMPNN) model was developed on an anti-osteoclastogenesis dataset for virtual screening. The model outperformed the baseline models and led to the identification of two nanomolar-level inhibitors of osteoclastogenesis with a new scaffold. Subsequent studies showed that these compounds significantly suppressed the expression of specific genes.

EUROPEAN JOURNAL OF MEDICINAL CHEMISTRY (2021)

Article Chemistry, Medicinal

Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts

Mostafa Karimi et al.

Summary: This study uses deep learning to improve the interpretability of compound-protein affinity prediction by defining intermolecular contacts. By embedding protein sequences and compound graphs with joint attentions, and by introducing three methodological advances, it achieves high accuracy in affinity prediction. Compared to other methods, the proposed models show better performance in both affinity prediction and contact prediction.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2021)

Article Biochemistry & Molecular Biology

UniProt: the universal protein knowledgebase in 2021

Alex Bateman et al.

Summary: The UniProt Knowledgebase aims to provide users with a comprehensive, high-quality set of protein sequences annotated with functional information. Updates over the past two years have increased the number of sequences to approximately 190 million, with new methods to assess proteome completeness and quality. UniProtKB has responded to the COVID-19 pandemic by expertly curating relevant entries and making them rapidly available through a dedicated portal.

NUCLEIC ACIDS RESEARCH (2021)

Article Plant Sciences

NPClassifier: A Deep Neural Network-Based Structural Classification Tool for Natural Products

Hyun Woo Kim et al.

Summary: Computational approaches like genome and metabolome mining are crucial in the research of natural products (NPs). An automated structure-type classification system is necessary to handle the vast amount of data on NP structures. NPClassifier, a deep-learning tool introduced in this study, is expected to accelerate NP discovery by linking structure to properties.

JOURNAL OF NATURAL PRODUCTS (2021)

Review Pharmacology & Pharmacy

Graph neural networks for automated de novo drug design

Jiacheng Xiong et al.

Summary: De novo drug design aims to create novel chemical entities with desired properties, and data-driven methods built on artificial intelligence technologies such as graph neural networks (GNNs) have recently become popular. Applications of GNNs in drug design include molecule scoring, generation, optimization, and synthesis planning. The review also discusses current challenges and future directions in this field.

DRUG DISCOVERY TODAY (2021)

Review Biotechnology & Applied Microbiology

Natural products in drug discovery: advances and opportunities

Atanas G. Atanasov et al.

Summary: Natural products and their analogues have historically played a significant role in pharmacotherapy; however, they also present challenges. Recent technological and scientific developments are addressing these challenges and revitalizing interest in natural products as drug leads, particularly for combating antimicrobial resistance.

NATURE REVIEWS DRUG DISCOVERY (2021)

Review Biochemical Research Methods

Deep drug-target binding affinity prediction with multiple attention blocks

Yuni Zeng et al.

Summary: In this study, an end-to-end model with multiple attention blocks was proposed to predict the binding affinity scores of drug-target pairs. The model encodes correlations between atoms using a relation-aware self-attention block and models the interaction between drug and target representations using a multi-head attention block. Experimental results show that the proposed approach outperforms existing methods by benefiting from encoded correlation and extracted interaction information.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Biochemistry & Molecular Biology

Target Prediction Model for Natural Products Using Transfer Learning

Bo Qiang et al.

Summary: The model, which uses transfer learning to predict targets for natural products, achieved an AUROC score of 0.910, indicating its potential in the field of drug discovery.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2021)

Article Multidisciplinary Sciences

Highly accurate protein structure prediction with AlphaFold

John Jumper et al.

Summary: Proteins are essential for life, and accurate prediction of their structures is a crucial research problem. Current experimental methods are time-consuming, highlighting the need for accurate computational approaches to address the gap in structural coverage. Despite recent progress, existing methods fall short of atomic accuracy in protein structure prediction.

NATURE (2021)

Article Biochemistry & Molecular Biology

PubChem in 2021: new data content and improved web interfaces

Sunghwan Kim et al.

Summary: PubChem, a popular chemical information resource, has made substantial improvements in the past two years by adding data from over 100 new sources, updating its homepage and record pages, introducing new services like the Periodic Table and Pathway pages, and creating a special data collection related to COVID-19 and SARS-CoV-2 in response to the pandemic.

NUCLEIC ACIDS RESEARCH (2021)

Review Chemistry, Multidisciplinary

Review on natural products databases: where to find data in 2020

Maria Sorokina et al.

JOURNAL OF CHEMINFORMATICS (2020)

Review Cardiac & Cardiovascular Systems

Selected Indonesian Medicinal Plants for the Management of Metabolic Syndrome: Molecular Basis and Recent Studies

Wawaimuli Arozal et al.

FRONTIERS IN CARDIOVASCULAR MEDICINE (2020)

Review Pharmacology & Pharmacy

Artificial intelligence in drug discovery and development

Debleena Paul et al.

DRUG DISCOVERY TODAY (2020)

Review Chemistry, Multidisciplinary

Molecular representations in AI-driven drug discovery: a review and practical guide

Laurianne David et al.

JOURNAL OF CHEMINFORMATICS (2020)

Article Pharmacology & Pharmacy

A Deep Learning-Based Approach for Identifying the Medicinal Uses of Plant-Derived Natural Compounds

Sunyong Yoo et al.

FRONTIERS IN PHARMACOLOGY (2020)

Article Computer Science, Artificial Intelligence

Extensions of the External Validation for Checking Learned Model Interpretability and Generalizability

Sung Yang Ho et al.

PATTERNS (2020)

Review Chemistry, Multidisciplinary

QSAR without borders

Eugene N. Muratov et al.

CHEMICAL SOCIETY REVIEWS (2020)

Article Chemistry, Physical

Are 2D fingerprints still valuable for drug discovery?

Kaifu Gao et al.

PHYSICAL CHEMISTRY CHEMICAL PHYSICS (2020)

Article Biochemical Research Methods

admetSAR 2.0: web-service for prediction and optimization of chemical ADMET properties

Hongbin Yang et al.

BIOINFORMATICS (2019)

Review Pharmacology & Pharmacy

Turning straw into gold: building robustness into gene signature inference

Wilson Wen Bin Goh et al.

DRUG DISCOVERY TODAY (2019)

Article Biochemistry & Molecular Biology

Mathematical deep learning for pose and binding affinity prediction and ranking in D3R Grand Challenges

Duc Duy Nguyen et al.

JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN (2019)

Article Engineering, Biomedical

DG-GL: Differential geometry-based geometric learning of molecular datasets

Duc Duy Nguyen et al.

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING (2019)

Article Biochemical Research Methods

Predicting Meridian in Chinese traditional medicine using machine learning approaches

Yinyin Wang et al.

PLOS COMPUTATIONAL BIOLOGY (2019)

Article Biochemistry & Molecular Biology

SymMap: an integrative database of traditional Chinese medicine enhanced by symptom mapping

Yang Wu et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Biochemistry & Molecular Biology

ChEMBL: towards direct deposition of bioassay data

David Mendez et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Engineering, Biomedical

Integration of element specific persistent homology and machine learning for protein-ligand binding affinity prediction

Zixuan Cang et al.

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING (2018)

Article Biochemistry & Molecular Biology

DrugBank 5.0: a major update to the DrugBank database for 2018

David S. Wishart et al.

NUCLEIC ACIDS RESEARCH (2018)

Article Chemistry, Multidisciplinary

Tuning artificial intelligence on the de novo design of natural-product-inspired retinoid X receptor modulators

Daniel Merk et al.

COMMUNICATIONS CHEMISTRY (2018)

Article Chemistry, Applied

Sweetness prediction of natural compounds

Jean-Baptiste Cheron et al.

FOOD CHEMISTRY (2017)

Article Chemistry, Medicinal

Rigidity Strengthening: A Mechanism for Protein-Ligand Binding

Duc D. Nguyen et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2017)

Article Chemistry, Medicinal

A Simple Representation of Three-Dimensional Molecular Structure

Seth D. Axen et al.

JOURNAL OF MEDICINAL CHEMISTRY (2017)

Article Biochemistry & Molecular Biology

BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology

Michael K. Gilson et al.

NUCLEIC ACIDS RESEARCH (2016)

Review Chemistry, Multidisciplinary

Counting on natural products for drug design

Tiago Rodrigues et al.

NATURE CHEMISTRY (2016)

Article Biochemical Research Methods

Molecular fingerprint similarity search in virtual screening

Adria Cereto-Massague et al.

METHODS (2015)

Article Chemistry, Multidisciplinary

Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?

David Bajusz et al.

JOURNAL OF CHEMINFORMATICS (2015)

Article Biochemistry & Molecular Biology

Super Natural II: a database of natural products

Priyanka Banerjee et al.

NUCLEIC ACIDS RESEARCH (2015)

Article Chemistry, Medicinal

QSAR Modeling: Where Have You Been? Where Are You Going To?

Artem Cherkasov et al.

JOURNAL OF MEDICINAL CHEMISTRY (2014)

Article Chemistry, Multidisciplinary

TCMSP: a database of systems pharmacology for drug discovery from herbal medicines

Jinlong Ru et al.

JOURNAL OF CHEMINFORMATICS (2014)

Review Biochemistry & Molecular Biology

Natural products: A continuing source of novel drug leads

Gordon M. Cragg et al.

BIOCHIMICA ET BIOPHYSICA ACTA-GENERAL SUBJECTS (2013)

Article Chemistry, Medicinal

SMIfp (SMILES fingerprint) Chemical Space for Virtual Screening and Visualization of Large Databases of Organic Molecules

Julian Schwartz et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2013)

Article Chemistry, Medicinal

Time-Split Cross-Validation as a Method for Estimating the Goodness of Prospective Prediction

Robert P. Sheridan

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2013)

Article Chemistry, Multidisciplinary

Open-source platform to benchmark fingerprints for ligand-based virtual screening

Sereina Riniker et al.

JOURNAL OF CHEMINFORMATICS (2013)

Article Biotechnology & Applied Microbiology

Exploring the human diseasome: the human disease network

Kwang-Il Goh et al.

BRIEFINGS IN FUNCTIONAL GENOMICS (2012)

Article Chemistry, Medicinal

Performance Evaluation of 2D Fingerprint and 3D Shape Similarity Methods in Virtual Screening

Guoping Hu et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2012)

Article Chemistry, Multidisciplinary

In silico toxicity prediction by support vector machine and SMILES representation-based string kernel

D.-S. Cao et al.

SAR AND QSAR IN ENVIRONMENTAL RESEARCH (2012)

Article Biochemistry & Molecular Biology

Modeling Natural Anti-Inflammatory Compounds by Molecular Topology

Maria Galvez-Llompart et al.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2011)

Article Chemistry, Multidisciplinary

PaDEL-Descriptor: An Open Source Software to Calculate Molecular Descriptors and Fingerprints

Chun Wei Yap

JOURNAL OF COMPUTATIONAL CHEMISTRY (2011)

Article Biochemistry & Molecular Biology

The discovery of artemisinin (qinghaosu) and gifts from Chinese medicine

Youyou Tu

NATURE MEDICINE (2011)

Article Chemistry, Multidisciplinary

Open Babel: An open chemical toolbox

Noel M. O'Boyle et al.

JOURNAL OF CHEMINFORMATICS (2011)

Article Chemistry, Medicinal

Extended-Connectivity Fingerprints

David Rogers et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2010)

Article Multidisciplinary Sciences

The human disease network

Kwang-Il Goh et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2007)

Article Computer Science, Information Systems

The wisdom hierarchy: representations of the DIKW hierarchy

Jennifer Rowley

JOURNAL OF INFORMATION SCIENCE (2007)

Article Biotechnology & Applied Microbiology

What is a support vector machine?

William S. Noble

NATURE BIOTECHNOLOGY (2006)

Letter Plant Sciences

Traditional Chinese medicine information database

ZL Ji et al.

JOURNAL OF ETHNOPHARMACOLOGY (2006)

Article Chemistry, Medicinal

Lead hopping using SVM and 3D pharmacophore fingerprints

JC Saeh et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2005)

Article Chemistry, Multidisciplinary

Reoptimization of MDL keys for use in drug discovery

JL Durant et al.

JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES (2002)

Article Biochemistry & Molecular Biology

The Protein Data Bank

HM Berman et al.

NUCLEIC ACIDS RESEARCH (2000)