Article

Rough set methods in feature selection via submodular function

Journal

SOFT COMPUTING
Volume 21, Issue 13, Pages 3699-3711

Publisher

SPRINGER
DOI: 10.1007/s00500-015-2024-7

Keywords

Attribute reduction; Granular computing; Mutual information; Rough set; Submodular function

Funding

  1. National Natural Science Foundation of China [61379049]

Abstract

Attribute reduction is an important problem in data mining and machine learning, as it can highlight favorable features and reduce the risk of over-fitting, thereby improving learning performance. In this regard, rough sets offer interesting opportunities. A reduct in rough sets is a subset of attributes/features that is jointly sufficient and individually necessary to satisfy a certain criterion. Excessive attributes may reduce diversity and increase correlation among features, while a smaller set of attributes can achieve nearly equal or even higher classification accuracy with some classifiers; this motivates us to address dimensionality reduction through attribute reduction from the joint viewpoint of learning performance and reduct size. In this paper, we propose a new attribute reduction criterion that selects the fewest attributes while largely preserving the performance of the corresponding learning algorithms. The main contributions of this work are twofold. First, we define the concept of a k-approximate-reduct, which, rather than being restricted to the minimum reduct, provides an important view of the connection between the size of an attribute reduct and learning performance. Second, a greedy algorithm for attribute reduction based on mutual information is developed, and submodular functions are used to analyze its convergence. The diminishing-returns property of submodularity provides a solid guarantee of the reasonableness of the k-approximate-reduct. Notably, rough sets serve as an effective tool for evaluating both the marginal and joint probability distributions among attributes in the mutual information. Extensive experiments on six real-world public datasets from a machine learning repository demonstrate that the subset selected by the mutual-information reduct achieves higher accuracy with fewer attributes when building naive Bayes and radial basis function network classifiers.
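
To make the greedy criterion concrete, below is a minimal, illustrative sketch (not the authors' implementation) of mutual-information-driven forward selection on a discrete decision table. The marginal gain of adding an attribute x to a chosen set S is I(S ∪ {x}; d) − I(S; d); under the paper's submodularity analysis, such gains diminish as S grows (f(A ∪ {x}) − f(A) ≥ f(B ∪ {x}) − f(B) for A ⊆ B), which justifies stopping once the gain becomes negligible. The table layout, the threshold epsilon, and the helper names are assumptions for illustration only.

```python
# Minimal sketch of greedy attribute reduction driven by mutual information
# I(S; d) between a candidate attribute subset S and the decision attribute d.
# Probabilities are estimated from frequencies in the decision table, in the
# spirit of rough-set equivalence classes. Illustrative only.
from collections import Counter
from math import log2


def mutual_information(rows, attrs, decision):
    """Estimate I(attrs; decision) from joint frequencies in the table."""
    n = len(rows)
    joint = Counter((tuple(r[a] for a in attrs), r[decision]) for r in rows)
    p_x = Counter(tuple(r[a] for a in attrs) for r in rows)
    p_y = Counter(r[decision] for r in rows)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * log2(p_xy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi


def greedy_reduct(rows, candidate_attrs, decision, epsilon=1e-6):
    """Greedily add the attribute with the largest marginal MI gain.

    Following the diminishing-returns intuition from the submodularity
    analysis, the marginal gain shrinks as the selected set grows, so the
    loop stops once the best gain falls below `epsilon`, yielding a small
    approximate reduct.
    """
    selected = []
    remaining = list(candidate_attrs)
    current_mi = 0.0
    while remaining:
        gains = {a: mutual_information(rows, selected + [a], decision) - current_mi
                 for a in remaining}
        best = max(gains, key=gains.get)
        if gains[best] <= epsilon:
            break
        selected.append(best)
        remaining.remove(best)
        current_mi += gains[best]
    return selected


# Toy decision table: attributes 'a', 'b' and decision 'd'.
table = [
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
]
print(greedy_reduct(table, ["a", "b"], "d"))  # ['b'], since 'b' alone determines 'd'
```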
