4.4 Article

A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities

期刊

JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN
卷 32, 期 2, 页码 375-384

出版社

SPRINGER
DOI: 10.1007/s10822-017-0094-6

关键词

Semi-supervised; Feature selection; Fisher criterion; Graph Laplacian; QSAR models

资金

  1. Hematology and Oncology Research Center of Shahid Sadoughi University of Medical Sciences [5666]

向作者/读者索取更多资源

Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is Fisher score which aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of Fisher criterion were extended for QSAR models to define the new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets such as serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the selected descriptors by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.4
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据