4.7 Article

Discussion of measures of enrichment in virtual screening: Comparing the information content of descriptors with increasing levels of sophistication

向作者/读者索取更多资源

We have performed virtual screening using some very simple features, by employing the number of atoms per element as molecular descriptors but without regard to any structural information whatsoever. Surprisingly, these atom counts are able to outperform virtual-affinity-based fingerprints and Unity fingerprints in some activity classes. Although molecular weight and other biases were known in target-based virtual screening settings (docking), we report the effect of using very simple descriptors for ligand-based virtual screening, by using clearly defined biological targets and employing a large data set (> 100 000 compounds) containing multiple (11) activity classes. Structure-unaware atom count vectors as descriptors in combination with the Euclidean distance measure are able to achieve enrichment factors over random selection of around 4 (depending on the particular class of active compounds), putting the enrichment factors reported for more sophisticated virtual screening methods in a different light. They are also able to retrieve active compounds with novel scaffolds instead of merely the expected structural analogues. The added value of many currently used virtual screening methods (calculated as enrichment factors) drops down to a factor of between I and 2, instead of often reported double-digit figures. The observed effect is much less profound for simple descriptors such as molecular weight and is only present in cases of atypical (larger) ligands. The current state of virtual screening is not as sophisticated as might be expected, which is due to descriptors still not being able to capture structural properties relevant to binding. This fact can partly be explained by highly nonlinear structure-activity relationships, which represent a severe limitation of the similar property principle in the context of bioactivity.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据