Journal
JOURNAL OF CHEMINFORMATICS
Volume 8, Issue -, Pages -
Publisher
BMC
DOI: 10.1186/s13321-016-0176-9
Keywords
Chemical space; Data mining; Molecular fingerprints; Molecular scaffolds; Physicochemical properties; Shannon entropy; Structural diversity
Funding
- Universidad Nacional Autonoma de Mexico (UNAM) [PAPIME PE200116]
- Programa de Apoyo a la Investigacion y el Posgrado (PAIP), Facultad de Quimica, UNAM [5000-9163]
- Institutional Program Nuevas Alternativas de Tratamiento para Enfermedades Infecciosas (NUATEI) of the Instituto de Investigaciones Biomedicas (IIB-UNAM)
- CONACyT [660465/576637]
Abstract
Background: Measuring the structural diversity of compound databases is relevant in drug discovery and many other areas of chemistry. Because molecular diversity depends on molecular representation, a comprehensive chemoinformatic analysis of library diversity uses multiple criteria. For instance, the diversity of molecular libraries is typically evaluated using molecular scaffolds, structural fingerprints, and physicochemical properties. However, each criterion is assessed independently, so it is not straightforward to evaluate the global diversity.

Results: Herein the Consensus Diversity Plot (CDP) is proposed as a novel method to represent the diversity of chemical libraries in low dimensions while simultaneously considering multiple molecular representations. We illustrate the application of CDPs to classify eight compound data sets and two subsets with different sizes and compositions using molecular scaffolds, structural fingerprints, and physicochemical properties.

Conclusions: CDPs are general data mining tools that represent the global diversity of compound data sets in two dimensions using multiple metrics. These plots can be constructed using single or combined measures of diversity. An online version of the CDPs is freely available at: https://consensusdiversityplots-difacquim-unam.shinyapps.io/RscriptsCDPlots/.
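The idea of placing each library in two dimensions according to more than one diversity measure can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which measure goes on which axis, so the choices here (median pairwise Tanimoto similarity of fingerprints on one axis, scaled Shannon entropy of scaffold frequencies on the other) are assumptions made for illustration only. The function names, the representation of fingerprints as sets of on-bit indices, and the toy libraries are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import median


def scaled_shannon_entropy(scaffolds):
    """Shannon entropy of scaffold frequencies, scaled to [0, 1].

    1.0 means compounds are spread evenly over the scaffolds;
    0.0 means a single scaffold accounts for every compound.
    """
    counts = Counter(scaffolds)
    total = sum(counts.values())
    n_scaffolds = len(counts)
    if n_scaffolds <= 1:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )
    return entropy / math.log2(n_scaffolds)


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def median_pairwise_similarity(fingerprints):
    """Median Tanimoto similarity over all distinct pairs in a library."""
    sims = [tanimoto(a, b) for a, b in combinations(fingerprints, 2)]
    return median(sims) if sims else 0.0


def diversity_point(scaffolds, fingerprints):
    """(x, y) position of one library (assumed axis assignment)."""
    return (
        median_pairwise_similarity(fingerprints),
        scaled_shannon_entropy(scaffolds),
    )


# Hypothetical toy libraries: scaffold labels and fingerprint on-bit sets.
library_a = (
    ["benzene", "benzene", "benzene", "pyridine"],
    [{1, 2, 3}, {1, 2, 4}, {1, 2, 3, 4}, {2, 3, 5}],
)
library_b = (
    ["benzene", "pyridine", "indole", "furan"],
    [{1, 2}, {3, 4}, {5, 6}, {1, 6}],
)

for name, (scaffolds, fps) in {"A": library_a, "B": library_b}.items():
    x, y = diversity_point(scaffolds, fps)
    print(f"library {name}: similarity={x:.2f}, scaffold entropy={y:.2f}")
```

Under these assumptions, a library with low fingerprint similarity and high scaffold entropy would be diverse by both criteria, while disagreement between the two coordinates flags a library that is diverse under one representation only. A third measure, such as a physicochemical property spread, could be encoded as marker color in an actual plot.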