4.6 Article

Utility-efficient differentially private K-means clustering based on cluster merging

Journal

NEUROCOMPUTING
Volume 424, Pages 205-214

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2020.10.051

Keywords

K-means; Cluster; Differential privacy

Funding

  1. Special Fund for Key Program of Science and Technology of Anhui Province, China [18030901027]
  2. Support Program for Outstanding Young Talents in Anhui Universities [gxyq2019001]
  3. National Natural Science Foundation of China [11301002, 61572031]
  4. Anhui Provincial Natural Science Foundation [2008085MF187]
  5. Natural Science Foundation for the Higher Education Institutions of Anhui Province of China [KJ2018A0017]

The paper introduces DP-KCCM, a differentially private k-means clustering algorithm that significantly improves clustering utility by combining adaptive noise with cluster merging. The algorithm generates n x k initial centroids, adds adaptive noise at each iteration to obtain n x k clusters, and then merges these clusters into k.
Differential privacy is widely used in data analysis. State-of-the-art k-means clustering algorithms with differential privacy typically add an equal amount of noise to centroids for each iterative computation. In this paper, we propose a novel differentially private k-means clustering algorithm, DP-KCCM, that significantly improves the utility of clustering by adding adaptive noise and merging clusters. Specifically, to obtain k clusters with differential privacy, the algorithm first generates n x k initial centroids, adds adaptive noise for each iteration to get n x k clusters, and finally merges these clusters into k ones. We theoretically prove the differential privacy of the proposed algorithm. Surprisingly, extensive experimental results show that: 1) cluster merging with equal amounts of noise improves the utility somewhat; 2) while adding adaptive noise only does not improve the utility, combining both cluster merging and adaptive noise further improves the utility significantly. (C) 2020 Elsevier B.V. All rights reserved.
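The three stages named in the abstract (n x k initial centroids, noisy iterations, merging down to k) can be sketched roughly as follows. This is an illustrative sketch, not the DP-KCCM algorithm itself: the abstract does not specify the adaptive noise schedule or the merging rule, so the increasing per-iteration budget and the closest-pair merge used here are assumptions, as are the function name `noisy_kmeans_merge` and all of its parameters. The data are assumed to be scaled into [0, 1]^d so that the Laplace noise scale can be derived from a bounded sensitivity.

```python
import numpy as np

def noisy_kmeans_merge(X, k, n=2, iters=5, epsilon=1.0, seed=0):
    """Illustrative sketch only; X is assumed to lie in [0, 1]^d."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Stage 1: n*k initial centroids, drawn independently of the data.
    centroids = rng.random((n * k, d))
    counts = np.ones(n * k)
    # ASSUMPTION: a hypothetical "adaptive" schedule that gives later
    # iterations a larger share of the budget (hence less noise).
    weights = np.arange(1, iters + 1, dtype=float)
    eps_per_iter = epsilon * weights / weights.sum()

    # Stage 2: Lloyd iterations with Laplace noise on sums and counts.
    for eps_t in eps_per_iter:
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # One record changes one count by 1 and one sum by at most d (L1).
        scale = (d + 1) / eps_t
        for j in range(n * k):
            members = X[labels == j]
            noisy_count = len(members) + rng.laplace(0.0, scale)
            noisy_sum = members.sum(axis=0) + rng.laplace(0.0, scale, size=d)
            counts[j] = max(noisy_count, 1.0)
            centroids[j] = np.clip(noisy_sum / counts[j], 0.0, 1.0)

    # Stage 3: merge down to k clusters using only the noisy outputs.
    # ASSUMPTION: repeatedly merge the two closest centroids,
    # weighting each by its noisy count.
    cents = [c for c in centroids]
    cnts = [float(c) for c in counts]
    while len(cents) > k:
        C = np.array(cents)
        D = ((C[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        np.fill_diagonal(D, np.inf)
        a, b = sorted(int(i) for i in np.unravel_index(D.argmin(), D.shape))
        total = cnts[a] + cnts[b]
        merged = (cnts[a] * C[a] + cnts[b] * C[b]) / total
        for idx in (b, a):  # delete the larger index first
            del cents[idx]
            del cnts[idx]
        cents.append(merged)
        cnts.append(total)
    return np.array(cents)
```

Because the merging step in this sketch reads only the already-noised centroids and counts, it spends no additional privacy budget; whether the paper's merging step works the same way is not stated in the abstract.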
