4.4 Article

SOS-SDP: An Exact Solver for Minimum Sum-of-Squares Clustering

Journal

INFORMS JOURNAL ON COMPUTING
Volume 34, Issue 4, Pages 2144-2162

Publisher

INFORMS
DOI: 10.1287/ijoc.2022.1166

Keywords

clustering; semidefinite programming; branch and bound

Funding

  1. European Union [764759]
  2. Universita degli Studi di Roma Tor Vergata


This paper proposes an exact algorithm based on the branch-and-bound technique for the minimum sum-of-squares clustering problem. The algorithm computes the lower bound using a cutting-plane procedure and the upper bound using a constrained version of k-means. Instance-level constraints are incorporated in the branch-and-bound procedure to express the relationships between data points.
The minimum sum-of-squares clustering problem (MSSC) consists of partitioning n observations into k clusters in order to minimize the sum of squared distances from the points to the centroid of their cluster. In this paper, we propose an exact algorithm for the MSSC problem based on the branch-and-bound technique. The lower bound is computed by using a cutting-plane procedure in which valid inequalities are iteratively added to the Peng-Wei semidefinite programming (SDP) relaxation. The upper bound is computed with the constrained version of k-means in which the initial centroids are extracted from the solution of the SDP relaxation. In the branch-and-bound procedure, we incorporate instance-level must-link and cannot-link constraints to express knowledge about which data points should or should not be grouped together. We manage to reduce the size of the problem at each level, preserving the structure of the SDP problem itself. To the best of our knowledge, the obtained results show that the approach allows us to successfully solve, for the first time, real-world instances up to 4,000 data points.
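To make the objective in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' solver. It evaluates the MSSC objective for a given assignment and checks it against the equivalent trace form tr(W W^T (I - Z)), where W is the data matrix and Z is the partition matrix with Z[i][j] = 1/|C| when points i and j share cluster C and 0 otherwise. The abstract does not spell out this form; it is the standard partition-matrix reformulation on which an SDP relaxation of this kind is built, and it is stated here as an assumption. The function names (`mssc_objective`, `trace_objective`, `satisfies_constraints`) and the toy data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def mssc_objective(points: List[Point], labels: List[int], k: int) -> float:
    """Sum of squared distances from each point to its cluster centroid."""
    dim = len(points[0])
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def trace_objective(points: List[Point], labels: List[int], k: int) -> float:
    """Same value written as tr(W W^T (I - Z)) with the partition matrix Z."""
    n = len(points)
    sizes = [labels.count(c) for c in range(k)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            gram = sum(a * b for a, b in zip(points[i], points[j]))  # (W W^T)[i][j]
            z = 1.0 / sizes[labels[i]] if labels[i] == labels[j] else 0.0
            identity = 1.0 if i == j else 0.0
            total += gram * (identity - z)
    return total


def satisfies_constraints(
    labels: List[int],
    must_link: List[Tuple[int, int]],
    cannot_link: List[Tuple[int, int]],
) -> bool:
    """Check instance-level constraints: must-link pairs share a cluster,
    cannot-link pairs do not."""
    return all(labels[i] == labels[j] for i, j in must_link) and all(
        labels[i] != labels[j] for i, j in cannot_link
    )


if __name__ == "__main__":
    # Hypothetical toy instance: four points forming two obvious clusters.
    pts = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
    assignment = [0, 0, 1, 1]
    print(mssc_objective(pts, assignment, 2))   # 4.0
    print(trace_objective(pts, assignment, 2))  # 4.0
    print(satisfies_constraints(assignment, [(0, 1)], [(1, 2)]))  # True
```

Relaxing the requirement that Z be an exact partition matrix (for example, keeping only row sums equal to 1, trace equal to k, nonnegativity, and positive semidefiniteness) gives a lower bound on the objective, which is the role the SDP relaxation plays in the branch-and-bound scheme described above.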

