Article

A reliable adaptive prototype-based learning for evolving data streams with limited labels

INFORMATION PROCESSING & MANAGEMENT (2024)

Journal

INFORMATION PROCESSING & MANAGEMENT

Volume 61, Issue 1, Pages -

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.ipm.2023.103532

Keywords

Data streams; Data-driven prototypes; Concept drift; Concept evolution; Semi-supervised classification

Categories

Computer Science, Information Systems; Information Science & Library Science

Summary

Data stream mining faces challenges of concept drift and evolution. Existing learning algorithms require class labels for all data points, but the rapid pace of data streams often leads to label scarcity. To address this, we propose an adaptive, data-driven, prototype-based semi-supervised learning framework that uses dynamic prototypes to handle evolving data streams and achieve improved data abstraction and detection of novel classes.

Abstract

Data stream mining presents notable challenges in the form of concept drift and evolution. Existing learning algorithms, typically designed within a supervised learning framework, require class labels for all data points. However, this is an impractical requirement given the rapid pace of data streams, which often results in label scarcity. Recognizing the realistic necessity of learning from data streams with limited labels, we propose an adaptive, data-driven, prototype-based semi-supervised learning framework specifically tailored to handle evolving data streams. Our method employs a prototype-based data representation, summarizing the continuous flow of streaming data using dynamic prototypes at varying levels of granularity. This technique enables improved data abstraction, capturing the underlying local data distributions more accurately. The model also incorporates reliability modeling and efficient emerging class discovery, dynamically updating the significance of prototypes over time and swiftly adapting to local concept drift. We further leverage these adaptive prototypes to intuitively detect concept evolution, i.e., identifying novel classes from a local density perspective. To minimize the need for manual labeling while optimizing performance, we incorporate active learning into our method. This method employs a dual-criteria approach for data point selection, considering both uncertainty and local density. These manually labeled data points, together with unlabeled data, serve to update the model efficiently and robustly. Empirical validation using several benchmark datasets demonstrates promising performance in comparison to existing state-of-the-art techniques.
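The abstract names three mechanisms: prototypes that summarize the stream, a reliability value that changes over time, and a dual-criteria (uncertainty plus local density) rule for choosing which points to label. The Python sketch below illustrates the general idea only. It is not the authors' algorithm: the class and method names (`PrototypeStream`, `update`, `is_novel`, `query_score`), the fixed `radius`, the exponential `decay`, the distance-ratio uncertainty, and the Gaussian-kernel density are all assumptions chosen for brevity.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prototype:
    center: List[float]
    label: Optional[int]      # None until a labeled point is absorbed
    count: int = 1            # number of points summarized so far
    reliability: float = 1.0  # decays while the prototype goes unused


def distance(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class PrototypeStream:
    """Hypothetical prototype summary of a stream (illustrative assumptions only)."""

    def __init__(self, radius: float = 1.0, decay: float = 0.99,
                 min_reliability: float = 0.05) -> None:
        self.radius = radius
        self.decay = decay
        self.min_reliability = min_reliability
        self.prototypes: List[Prototype] = []

    def nearest(self, x: List[float]) -> Tuple[Optional[Prototype], float]:
        if not self.prototypes:
            return None, math.inf
        best = min(self.prototypes, key=lambda p: distance(p.center, x))
        return best, distance(best.center, x)

    def update(self, x: List[float], label: Optional[int] = None) -> None:
        # Age every prototype, then drop those that became unreliable
        # (a simple stand-in for adapting to local concept drift).
        for p in self.prototypes:
            p.reliability *= self.decay
        self.prototypes = [p for p in self.prototypes
                           if p.reliability >= self.min_reliability]
        p, d = self.nearest(x)
        if p is None or d > self.radius:
            # No prototype covers x: start a new one.
            self.prototypes.append(Prototype(list(x), label))
            return
        # Absorb x: incremental mean update and reliability refresh.
        p.count += 1
        p.center = [c + (xi - c) / p.count for c, xi in zip(p.center, x)]
        p.reliability = 1.0
        if label is not None:
            p.label = label

    def is_novel(self, x: List[float]) -> bool:
        # A point outside every prototype's radius is a novelty candidate.
        return self.nearest(x)[1] > self.radius

    def query_score(self, x: List[float]) -> float:
        # Uncertainty: ratio of the distance to the nearest labeled prototype
        # over the distance to the nearest prototype of a different label.
        labeled = sorted(((distance(p.center, x), p.label)
                          for p in self.prototypes if p.label is not None),
                         key=lambda t: t[0])
        if not labeled:
            return 0.0
        d1, y1 = labeled[0]
        others = [d for d, y in labeled if y != y1]
        uncertainty = d1 / others[0] if others and others[0] > 0 else 0.0
        # Local density: count-weighted Gaussian kernel over all prototypes.
        total = sum(p.count for p in self.prototypes)
        density = sum(p.count * math.exp(-(distance(p.center, x) / self.radius) ** 2)
                      for p in self.prototypes) / total
        return uncertainty * density
```

In use, one would call `update` on every arriving point, passing a label only when one is available, and request a label from an annotator when `query_score` exceeds some budget-dependent threshold. The threshold and the product form of the score are design choices of this sketch, not details taken from the paper.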
