Article

A novel self-learning feature selection approach based on feature attributions

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 183, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.115219

Keywords

Feature selection; Local search; Feature attribution; Self-learning

Funding

  1. National Key R&D Program of China [2019YFB1704402]

Feature selection plays a crucial role in improving the accuracy and generalization of machine learning models, especially on high-dimensional data. This study proposes a novel self-learning feature selection (SLFS) approach based on feature attributions that searches for optimal feature subsets more efficiently. Experiments show that SLFS reaches an optimal subset in fewer iterations and that SHAP values improve search efficiency.
Feature selection has proven effective in improving the accuracy and generalization of machine learning models, especially for tasks with high-dimensional data. In this article, a novel self-learning feature selection (SLFS) approach based on feature attributions is proposed as a wrapper method. It searches for optimal feature subsets more efficiently through three main improvements. First, we regard feature selection as a combinatorial optimization problem and propose a unified local search framework for wrapper methods by analyzing meta-heuristic algorithms in feature selection. Second, for the binary search space of feature selection, we propose two types of neighborhood structures, namely ring-type and line-type structures, for the local search framework. Third, we focus on feature attribution methods, such as SHAP (SHapley Additive exPlanations) (Lundberg & Lee, 2017), which can interpret each feature's importance to predictions. In each iteration, we adopt SHAP values and other attributes from previous subsets to guide the selection of new subsets. To validate the performance of our SLFS approach, we collected 16 classification datasets from the UCI repository for comparison with other meta-heuristic wrapper approaches in terms of fitness, accuracy, F1 scores and selection ratios. The experimental results show that the SLFS approach can obtain an optimal subset with fewer iterations and a small population, and that SHAP values play a role in improving search efficiency.
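The general idea of an attribution-guided local search over binary feature masks can be illustrated with a minimal sketch. This is not the authors' SLFS algorithm: the abstract does not define the ring-type and line-type neighborhoods, so the sketch assumes a plain single-bit-flip neighborhood. The function `guided_local_search` and the callbacks `fitness` and `attribution` are hypothetical names. In a real wrapper, `fitness` would train and score a model on the selected features and `attribution` would return per-feature scores such as mean absolute SHAP values; here both are toy stand-ins.

```python
import random
from typing import Callable, List, Sequence


def guided_local_search(
    n_features: int,
    fitness: Callable[[Sequence[int]], float],
    attribution: Callable[[Sequence[int]], Sequence[float]],
    iterations: int = 50,
    seed: int = 0,
) -> List[int]:
    """Hill-climb over binary masks using single-bit flips (an assumed neighborhood)."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_features)]
    best_fit = fitness(current)
    for _ in range(iterations):
        scores = attribution(current)
        selected = [i for i, bit in enumerate(current) if bit]
        unselected = [i for i, bit in enumerate(current) if not bit]
        # Try dropping the selected feature with the lowest attribution first,
        # then try adding unselected features in random order.
        moves = []
        if selected:
            moves.append(min(selected, key=lambda i: scores[i]))
        rng.shuffle(unselected)
        moves.extend(unselected)
        improved = False
        for i in moves:
            neighbor = list(current)
            neighbor[i] ^= 1  # flip one bit of the mask
            f = fitness(neighbor)
            if f > best_fit:
                current, best_fit, improved = neighbor, f, True
                break
        if not improved:
            break  # no tried move improves the fitness
    return current


# Toy stand-ins (assumptions): features 0 and 2 are informative.
INFORMATIVE = {0, 2}


def toy_fitness(mask: Sequence[int]) -> float:
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)  # reward useful features, penalize subset size


def toy_attribution(mask: Sequence[int]) -> List[float]:
    return [1.0 if i in INFORMATIVE else 0.0 for i in range(len(mask))]


mask = guided_local_search(6, toy_fitness, toy_attribution, iterations=20, seed=1)
print(mask)
```

The size penalty in `toy_fitness` mirrors the usual wrapper trade-off between predictive quality and selection ratio. The weight 0.1 is an arbitrary choice for this illustration.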
