Article

Cross-Mapping of Protein-Ligand Binding Data Between ChEMBL and PDBbind

Journal

MOLECULAR INFORMATICS
Volume 34, Issue 8, Pages 568-576

Publisher

WILEY-VCH VERLAG GMBH
DOI: 10.1002/minf.201500010

Keywords

Public database; Binding data; Protein-ligand complex; Drug discovery

Funding

  1. Chinese National Natural Science Foundation [21072213, 81172984, 21472227, 81430083, 21102168, 21102165, 21402230]
  2. Chinese Ministry of Science and Technology (863 High-Tech Program) [2012AA020308]
  3. Science and Technology Development Fund of Macao SAR [055/2013/A2]
  4. MSD China


The ChEMBL database is a valuable open data source that provides a comprehensive collection of binding data as well as functional and ADMET properties of bioactive compounds. The PDBbind database has a more focused scope: it collects binding data for the protein-ligand complexes in the Protein Data Bank. Currently, the PDBbind collection of binding data is rather modest compared with the ChEMBL collection (≈13,000 versus ≈1.3 million). One may wonder whether the former is simply a subset of the latter. In this study, we mapped the molecular information and protein-ligand binding data in PDBbind to the records in ChEMBL, and then analyzed the overlap between the binding data recorded in the two databases. Our results indicate that only ≈20% of the binding data in PDBbind have counterparts in ChEMBL. Thus, the PDBbind collection of binding data is largely complementary to the ChEMBL collection. We also identify two reasons for the low overlap between the two databases: first, only a minor fraction of the protein-ligand complexes in PDBbind is covered by ChEMBL; second, the literature spaces screened by the two databases do not overlap substantially either. Our study thus demonstrates the value of focused databases versus more comprehensive ones.
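
The overlap analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' mapping procedure: the record layout, the choice of matching key (protein identifier, ligand identifier, affinity type), the function name `overlap_fraction`, and the toy records are all assumptions made for illustration only. The sketch computes the fraction of records in a focused collection that have an exact counterpart in a larger collection.

```python
from typing import Iterable, NamedTuple


class BindingRecord(NamedTuple):
    """Hypothetical record layout; not the schema of either database."""
    protein_id: str     # assumed protein key (e.g. a sequence accession)
    ligand_id: str      # assumed ligand key (e.g. a structure hash)
    affinity_type: str  # e.g. "Kd", "Ki" or "IC50"


def overlap_fraction(focused: Iterable[BindingRecord],
                     comprehensive: Iterable[BindingRecord]) -> float:
    """Fraction of `focused` records that also occur in `comprehensive`."""
    focused = list(focused)
    if not focused:
        return 0.0
    # NamedTuples are hashable, so exact-match lookup is a set membership test.
    known = set(comprehensive)
    matched = sum(record in known for record in focused)
    return matched / len(focused)


# Made-up placeholder records, NOT data from ChEMBL or PDBbind.
focused_toy = [
    BindingRecord("P1", "L1", "Kd"),
    BindingRecord("P1", "L2", "Ki"),
    BindingRecord("P2", "L3", "IC50"),
    BindingRecord("P3", "L4", "Kd"),
    BindingRecord("P4", "L5", "Ki"),
]
comprehensive_toy = [
    BindingRecord("P1", "L1", "Kd"),    # the only exact counterpart
    BindingRecord("P1", "L2", "IC50"),  # same pair, different affinity type
    BindingRecord("P9", "L9", "Ki"),
]

fraction = overlap_fraction(focused_toy, comprehensive_toy)
print(f"{fraction:.0%} of the focused records have a counterpart")
```

Exact matching on the full key is a deliberately strict choice. A real cross-mapping would also have to reconcile differing protein and ligand identifiers between the two sources, which this sketch does not attempt.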
