4.6 Article

Online Streaming Feature Selection via Conditional Independence

Journal

APPLIED SCIENCES-BASEL
Volume 8, Issue 12

Publisher

MDPI
DOI: 10.3390/app8122548

Keywords

streaming feature; feature selection; conditional independence; Markov blanket

Funding

  1. US National Science Foundation (NSF) [1613950, 1763620]
  2. National Natural Science Foundation of China [61772450]
  3. Hebei Provincial Department of Education Scientific Research Program of China [QN2016073]
  4. China Postdoctoral Science Foundation [2018M631764]
  5. Hebei Postdoctoral Research Program [B2018003009]
  6. Doctoral Fund of Yanshan University [BL18003, B906]
  7. Hebei Province Natural Science Foundation of China [F2017203307, F2016203290]
  8. Division of Information & Intelligent Systems
  9. Directorate for Computer & Information Science & Engineering [1613950] (funding source: National Science Foundation)

Online feature selection is a challenging topic in data mining. It aims to reduce the dimensionality of streaming features by removing irrelevant and redundant features in real time. Existing methods, such as Alpha-investing and Online Streaming Feature Selection (OSFS), have been proposed for this purpose, but they suffer from drawbacks, including low prediction accuracy and long running time, when the streaming features exhibit low redundancy and high relevance. In this paper, we propose a novel online streaming feature selection algorithm, named ConInd, that uses a three-layer filtering strategy to process streaming features and overcome these drawbacks. Through the three filtering layers, i.e., null-conditional independence, single-conditional independence, and multi-conditional independence, we obtain an approximate Markov blanket with high accuracy and low running time. To validate its efficiency, we implemented the proposed algorithm and tested its performance on widely used datasets, i.e., NIPS 2003 and Causality Workbench. Extensive experimental results demonstrate that ConInd offers significant improvements in prediction accuracy and running time compared to Alpha-investing and OSFS. ConInd achieves 5.62% higher average prediction accuracy than Alpha-investing and a 53.56% lower average running time than OSFS when the dataset has low redundancy and high relevance. In addition, the ratio of the average number of features for ConInd is 242% less than that for Alpha-investing.
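
The three-layer filtering named in the abstract can be illustrated with a minimal sketch. The abstract does not say which independence test, significance level, or conditioning-set size the paper uses, so everything here is an assumption: a Fisher z-test on partial correlations (suitable only for roughly linear, continuous data), a level `alpha = 0.05`, a cap `max_k` on conditioning-set size, and the hypothetical helper names `corr`, `partial_corr`, `independent`, and `three_layer_filter`. Unlike a full Markov blanket method, this sketch only tests the arriving feature and never re-examines features it has already kept.

```python
import math
from itertools import combinations


def corr(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, zs):
    """Partial correlation of x and y given the columns in zs (recursive formula)."""
    if not zs:
        return corr(x, y)
    z, rest = zs[0], zs[1:]
    rxy = partial_corr(x, y, rest)
    rxz = partial_corr(x, z, rest)
    ryz = partial_corr(y, z, rest)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def independent(x, y, zs, alpha=0.05):
    """Fisher z-test (an assumed choice): True if x and y look independent given zs."""
    r = max(-0.999999, min(0.999999, partial_corr(x, y, zs)))
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(len(x) - len(zs) - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return p_value > alpha


def three_layer_filter(stream, target, alpha=0.05, max_k=3):
    """Keep an arriving feature only if it passes all three layers."""
    selected = {}  # feature name -> column, in arrival order
    for name, col in stream:
        # Layer 1, null-conditional: discard features irrelevant to the target.
        if independent(col, target, [], alpha):
            continue
        kept = list(selected.values())
        # Layer 2, single-conditional: discard if one kept feature explains it.
        if any(independent(col, target, [z], alpha) for z in kept):
            continue
        # Layer 3, multi-conditional: discard if a larger subset explains it.
        if any(independent(col, target, list(zs), alpha)
               for k in range(2, min(max_k, len(kept)) + 1)
               for zs in combinations(kept, k)):
            continue
        selected[name] = col
    return list(selected)


# Synthetic, deterministic data: a, b, c, d are mutually orthogonal +/-1 columns.
bits = [(i & 1, (i >> 1) & 1, (i >> 2) & 1) for i in range(8)] * 8
a = [1.0 if r[0] else -1.0 for r in bits]
b = [1.0 if r[1] else -1.0 for r in bits]
c = [1.0 if r[2] else -1.0 for r in bits]
d = [u * v for u, v in zip(a, b)]
target = [u + v + 0.5 * w for u, v, w in zip(a, b, d)]

stream = [
    ("f_a", a),                                            # relevant
    ("noise", c),                                          # irrelevant: layer 1
    ("f_a_copy", [u + 0.5 * w for u, w in zip(a, c)]),     # redundant given f_a: layer 2
    ("f_b", b),                                            # relevant
    ("f_ab", [u + v + 0.5 * w for u, v, w in zip(a, b, c)]),  # redundant given {f_a, f_b}: layer 3
]
selected = three_layer_filter(stream, target)
print(selected)
```

On this toy stream, each discarded feature is rejected by a different layer, which is why the cheaper tests run first: most irrelevant or singly redundant features never reach the subset search in the third layer.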
