Article

Multilabel feature selection using ML-ReliefF and neighborhood mutual information for multilabel neighborhood decision systems

Journal

INFORMATION SCIENCES
Volume 537, Pages 401-424

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2020.05.102

Keywords

Neighborhood rough sets; Multilabel feature selection; ReliefF; Neighborhood mutual information; Multilabel classification

Funding

  1. National Natural Science Foundation of China [61772176, 61976082, 61976120, 61672332]
  2. Plan of Scientific Innovation Talent of Henan Province [184100510003]
  3. Young Scholar Program of Henan Province [2017GGJS041]
  4. Natural Science Foundation of Henan Province [162300410178]
  5. Natural Science Foundation of Jiangsu Province [BK20191445]
  6. Six Talent Peaks Project of Jiangsu Province [XYDXXJS-048]
  7. Qing Lan Project of Jiangsu Province

Feature selection is an essential preprocessing step in multilabel classification and has been widely studied. Because multilabel datasets are diverse and complex, some feature selection methods are unstable and yield low predictive accuracy. To address these issues, this paper presents a novel multilabel feature selection method that uses multilabel ReliefF (ML-ReliefF) and neighborhood mutual information in multilabel neighborhood decision systems. First, to address the small number of randomly selected samples available when ReliefF searches for the nearest samples, the coefficient of difference and the average distance among the nearest similar and heterogeneous samples are introduced to evaluate the differences among samples, and the average differences among similar or heterogeneous samples are then defined. Using the Jaccard correlation coefficient, a new formula for updating feature weights is presented. Second, the margin of each sample is studied to granulate all samples under each label, and the concept of the neighborhood is given. By combining the algebraic and information views, several neighborhood entropy-based uncertainty measures for multilabel classification are investigated, and a new neighborhood mutual information is proposed. Furthermore, an optimization objective function is constructed to evaluate candidate features in multilabel neighborhood decision systems, the properties of these measures are discussed, and the relationships among them are established. Finally, an improved ML-ReliefF algorithm is designed to preliminarily eliminate irrelevant features and thereby reduce the computational complexity of multilabel classification, and a heuristic forward multilabel feature selection algorithm is developed to remove redundant features and improve classification performance. Experiments on thirteen multilabel datasets compare the proposed algorithms with representative methods and verify their effectiveness in multilabel neighborhood decision systems. © 2020 Elsevier Inc. All rights reserved.
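
The abstract does not give the weight-update formula, so the following is only a rough illustration of the general idea of a ReliefF-style update driven by Jaccard similarity between label sets. It is not the paper's ML-ReliefF. The function names `jaccard` and `relief_style_weights`, the soft hit/miss rule `(2 * dissim - 1)`, the Manhattan distance, and the toy data are all assumptions made for this sketch.

```python
from typing import List, Sequence


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard similarity of two binary label vectors."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two empty label sets are treated as identical.
    return inter / union if union else 1.0


def relief_style_weights(
    X: Sequence[Sequence[float]], Y: Sequence[Sequence[int]], k: int
) -> List[float]:
    """Illustrative ReliefF-like feature weights for multilabel data.

    Hypothetical rule (not taken from the paper): a neighbour with
    dissimilar labels rewards features on which it differs, while a
    neighbour with similar labels penalises them.
    """
    n, m = len(X), len(X[0])
    weights = [0.0] * m
    for i in range(n):
        # k nearest neighbours by Manhattan distance; features are
        # assumed to be scaled to [0, 1] beforehand.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sum(abs(p - q) for p, q in zip(X[i], X[j])),
        )[:k]
        for j in neighbours:
            dissim = 1.0 - jaccard(Y[i], Y[j])
            for f in range(m):
                diff = abs(X[i][f] - X[j][f])
                weights[f] += (2.0 * dissim - 1.0) * diff / (n * len(neighbours))
    return weights


# Toy data: feature 0 separates the two label sets, feature 1 does not.
X_demo = [[0.0, 0.5], [0.1, 0.4], [0.9, 0.5], [1.0, 0.4]]
Y_demo = [[1, 0], [1, 0], [0, 1], [0, 1]]
demo_weights = relief_style_weights(X_demo, Y_demo, k=3)
print(demo_weights)
```

In a two-stage pipeline like the one the abstract describes, weights of this kind would serve only as a cheap prefilter, with low-weight features dropped before a costlier forward selection step. The threshold for that cut is not stated in the abstract and would have to be chosen separately.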
