4.7 Article

Cluster Purging: Efficient Outlier Detection Based on Rate-Distortion Theory

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2021.3103571

Keywords

Outlier detection; clustering algorithms; rate-distortion theory

Rate-distortion theory-based outlier detection rests on the rationale that a good data compression encodes outliers with unique symbols. We propose Cluster Purging as an extension of clustering-based outlier detection that assesses the representivity of a clustering and identifies data best represented by individual unique clusters. We present two efficient algorithms for Cluster Purging: one is parameter-free, and the other exposes a parameter that can be tuned in supervised setups.
Rate-distortion theory-based outlier detection builds upon the rationale that a good data compression will encode outliers with unique symbols. Based on this rationale, we propose Cluster Purging, which is an extension of clustering-based outlier detection. This extension allows one to assess the representivity of clusterings, and to find data that are best represented by individual unique clusters. We propose two efficient algorithms for performing Cluster Purging, one being parameter-free, while the other algorithm has a parameter that controls representivity estimations, allowing it to be tuned in supervised setups. In an experimental evaluation, we show that Cluster Purging improves upon outliers detected from raw clusterings, and that Cluster Purging competes strongly against state-of-the-art alternatives.
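
To make the rationale concrete, the sketch below illustrates plain clustering-based outlier detection, the baseline that the abstract says Cluster Purging extends. It is not the paper's algorithm. It assumes a clustering has already produced centroids, measures each point's distortion as the squared distance to its nearest centroid, and flags points whose distortion is far above the median, i.e. points that the shared cluster symbols represent poorly. The function names, the `factor` threshold, and the sample data are all hypothetical choices made for illustration.

```python
import math
from statistics import median


def nearest_centroid_distortion(point, centroids):
    """Squared Euclidean distance from a point to its closest centroid."""
    return min(math.dist(point, c) ** 2 for c in centroids)


def flag_outliers(points, centroids, factor=5.0):
    """Flag points whose distortion is far above the median distortion.

    `factor` is a hypothetical tuning knob, not a parameter from the paper.
    """
    distortions = [nearest_centroid_distortion(p, centroids) for p in points]
    cutoff = factor * median(distortions)
    return [d > cutoff for d in distortions]


# Two tight groups plus one point that neither centroid represents well.
points = [
    (0.0, 0.0), (0.1, 0.2), (-0.2, 0.1),
    (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
    (2.5, 9.0),
]
centroids = [(0.0, 0.1), (5.0, 5.0)]  # assumed output of some clustering step

flags = flag_outliers(points, centroids)
print(flags)
```

In this toy data only the last point is flagged. A median-based cutoff is just one simple choice; it breaks down when more than half of the points coincide with a centroid, since the median distortion is then zero.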
