Benchmarking Ligand-Based Virtual High-Throughput Screening with the PubChem Database

MOLECULES (2013)

Journal

MOLECULES

Volume 18, Issue 1, Pages 735-756

Publisher

MDPI

DOI: 10.3390/molecules18010735

Keywords

virtual screening; machine learning; quantitative structure-activity relations (QSAR); high-throughput screening (HTS); cheminformatics; PubChem; BCL

Funding

NIH [R01 MH090192, R01 GM080403]
NSF [Career 0742762, 0959454]
NIH through the CI-TraCS fellowship [OCI-1122919]
NSF Directorate for Computer & Information Science & Engineering, Office of Advanced Cyberinfrastructure (OAC) [0959454]

Abstract

With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is in the public domain through PubChem and was carefully collated using confirmation screens that validate active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework, BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure-activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed, including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 are observed at a true positive rate (TPR) cutoff of 25%.
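The abstract reports enrichment at a fixed TPR cutoff but does not spell out the formula. As a rough illustration only, the sketch below assumes the common definition: rank compounds by predicted score, take the shortest top-ranked list that recovers the requested fraction of actives, and divide the fraction of actives in that list by the fraction of actives in the whole data set. The function name `enrichment_at_tpr` and its parameters are hypothetical and are not part of BCL::ChemInfo.

```python
import math


def enrichment_at_tpr(scores, labels, tpr_cutoff=0.25):
    """Enrichment at a given true positive rate (assumed definition).

    scores: predicted activity scores, higher means more likely active.
    labels: 1 for confirmed actives, 0 for inactives.
    Ties in score are broken arbitrarily by the sort; this is a simplification.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_total = len(labels)
    n_active = sum(labels)
    if n_active == 0:
        raise ValueError("at least one active compound is required")

    # Number of actives that must be recovered to reach the TPR cutoff.
    needed = max(1, math.ceil(tpr_cutoff * n_active))

    # Walk down the ranked list until enough actives have been found.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    for n_selected, (_, is_active) in enumerate(ranked, start=1):
        found += is_active
        if found >= needed:
            precision = found / n_selected   # actives among selected compounds
            base_rate = n_active / n_total   # actives in the whole data set
            return precision / base_rate
```

Under this definition, an enrichment of 15 would mean that the top-ranked compounds contain actives at 15 times the rate expected from random selection.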
