Article

Hybrid feature selection using component co-occurrence based feature relevance measurement

EXPERT SYSTEMS WITH APPLICATIONS (2018)

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Volume 102, Pages 83-99

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.eswa.2018.01.041

Keywords

Feature selection; Mutual information; Hierarchical agglomerative clustering; Support vector machine; K-nearest neighbor

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science

Funding

Beijing Natural Science Foundation [4174105]
Joint Funds of the National Natural Science Foundation of China [U1509214]

Abstract

Feature selection, which chooses a subset of relevant features, has attracted considerable attention in recent years. Typical feature selection methods include traditional filters, mutual information based methods, clustering based methods, and hybrid methods. Because many of these methods cannot find the best features both effectively and efficiently, this paper proposes a new hybrid feature selection method. First, the drawbacks of some existing feature relevance measurements are analyzed and a component co-occurrence based feature relevance measurement is proposed. Then, the implementation of the proposed method is given in three steps: (1) the samples are preprocessed and two feature subsets are obtained using two different optimal filters; (2) a feature weight based union operation merges the two subsets; (3) because the hierarchical agglomerative clustering algorithm can produce high-quality clusters without requiring the number of clusters in advance, it is applied with a predetermined threshold to obtain the final feature subset. In the experiments, two typical classifiers, support vector machine and K-nearest neighbor, are used on eight datasets (Lung-cancer, Breast-cancer-wisconsin, Arrhythmia, Arcene, CNAE-9, Madelon, Spambase, and KDD-cup-1999), and 10-fold cross-validation is carried out with the F1 measure as the evaluation metric. Experimental results show that the proposed feature relevance measurement outperforms traditional ones. In addition, the proposed feature selection method outperforms many existing typical methods in classification accuracy and execution speed, illustrating its effectiveness in finding the best features. © 2018 Published by Elsevier Ltd.
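
The three-step pipeline described in the abstract (two filters, a weight based union, threshold-cut agglomerative clustering) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not define the component co-occurrence relevance measure, the two "optimal filters", or the union rule, so the sketch substitutes absolute Pearson correlation and discrete mutual information as stand-in filters, keeps the larger normalized weight when a feature appears in both subsets, and clusters features with average linkage on the distance 1 - |correlation|. All function names, the toy data, and the values of `k` and `threshold` are hypothetical.

```python
import math
from collections import Counter

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())

def top_k(scores, k):
    """Step 1 (stand-in filter): keep the k best features, rescaled so the best has weight 1."""
    best = sorted(scores, key=scores.get, reverse=True)[:k]
    top = max(scores[f] for f in best) or 1.0
    return {f: scores[f] / top for f in best}

def weighted_union(sub_a, sub_b):
    """Step 2 (assumed rule): merge two weighted subsets, keeping the larger weight."""
    return {f: max(sub_a.get(f, 0.0), sub_b.get(f, 0.0)) for f in set(sub_a) | set(sub_b)}

def agglomerate(features, columns, threshold):
    """Step 3: average-linkage agglomerative clustering of features.
    Merging stops once the closest pair of clusters is farther apart than threshold."""
    def dist(f, g):
        return 1.0 - abs(pearson(columns[f], columns[g]))

    clusters = [[f] for f in sorted(features)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(dist(f, g) for f in clusters[i] for g in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters

def select_features(columns, labels, k=2, threshold=0.3):
    corr_scores = {f: abs(pearson(col, labels)) for f, col in columns.items()}
    mi_scores = {f: mutual_information(col, labels) for f, col in columns.items()}
    merged = weighted_union(top_k(corr_scores, k), top_k(mi_scores, k))
    clusters = agglomerate(merged, columns, threshold)
    # Keep one representative per cluster: the member with the largest weight.
    return sorted(max(c, key=lambda f: (merged[f], f)) for c in clusters)

# Toy data (hypothetical): 8 samples, binary labels, 4 discrete features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a":      [0, 0, 0, 0, 1, 1, 1, 1],  # linearly informative
    "a_dup":  [0, 0, 0, 1, 1, 1, 1, 1],  # redundant near-copy of "a"
    "nonlin": [1, 1, 1, 1, 0, 2, 0, 2],  # informative, but zero linear correlation
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the labels
}
selected = select_features(columns, labels, k=2, threshold=0.3)
print(selected)  # ['a', 'nonlin']
```

On this toy data the correlation filter picks `a` and `a_dup`, the mutual information filter picks `a` and `nonlin`, the union keeps all three, and clustering collapses the redundant pair into `a`. This shows why combining complementary filters before a redundancy-removal step can help. The selected subset would then be passed to a classifier such as a support vector machine or K-nearest neighbor for evaluation.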
