4.6 Article Proceedings Paper

Efficient Algorithm for the k-Means Problem with Must-Link and Cannot-Link Constraints

期刊

TSINGHUA SCIENCE AND TECHNOLOGY
卷 28, 期 6, 页码 1050-1062

出版社

TSINGHUA UNIV PRESS
DOI: 10.26599/TST.2022.9010056

关键词

Heuristic algorithms; Clustering algorithms; Data science; Approximation algorithms; Iterative algorithms; Artificial intelligence; Convergence; Constrained $k$-means; Must-Link (ML) and Cannot-Link (CL) constraints; approximation algorithm; constrained clustering

向作者/读者索取更多资源

Constrained clustering with instance-level Must-Link (ML) and Cannot-Link (CL) auxiliary information has been extensively studied due to its broad applications. However, no algorithm has provided a non-trivial approximation ratio to the constrained k-means problem. To address this issue, we propose an algorithm with a provable approximation ratio of O(log k/) when only ML constraints are considered. Experimental results show that our algorithm outperforms existing greedy-based heuristic methods in clustering accuracy.
Constrained clustering, such as k -means with instance-level Must-Link (ML) and Cannot-Link (CL) auxiliary information as the constraints, has been extensively studied recently, due to its broad applications in data science and AI. Despite some heuristic approaches, there has not been any algorithm providing a non-trivial approximation ratio to the constrained k -means problem. To address this issue, we propose an algorithm with a provable approximation ratio of O(log k/ when only ML constraints are considered. We also empirically evaluate the performance of our algorithm on real-world datasets having artificial ML and disjoint CL constraints. The experimental results show that our algorithm outperforms the existing greedy-based heuristic methods in clustering accuracy.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据