4.6 Article

Utility-efficient differentially private K-means clustering based on cluster merging

Journal

NEUROCOMPUTING
Volume 424, Issue -, Pages 205-214

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2020.10.051

Keywords

K-means; Cluster; Differential privacy

Funding

  1. Special Fund for Key Program of Science and Technology of Anhui Province, China [18030901027]
  2. Support Program for Outstanding Young Talents in Anhui Universities [gxyq2019001]
  3. National Natural Science Foundation of China [11301002, 61572031]
  4. Anhui Provincial Natural Science Foundation [2008085MF187]
  5. Natural Science Foundation for the Higher Education Institutions of Anhui Province of China [KJ2018A0017]

Ask authors/readers for more resources

The paper introduces a novel differentially private k-means clustering algorithm, DP-KCCM, which improves the utility of clustering significantly by adding adaptive noise and merging clusters. The algorithm first generates initial centroids, adds adaptive noise, and further improves the utility by merging clusters.
Differential privacy is widely used in data analysis. State-of-the-art k-means clustering algorithms with differential privacy typically add an equal amount of noise to centroids for each iterative computation. In this paper, we propose a novel differentially private k-means clustering algorithm, DP-KCCM, that significantly improves the utility of clustering by adding adaptive noise and merging clusters. Specifically, to obtain k clusters with differential privacy, the algorithm first generates n x k initial centroids, adds adaptive noise for each iteration to get n x k clusters, and finally merges these clusters into k ones. We theoretically prove the differential privacy of the proposed algorithm. Surprisingly, extensive experimental results show that: 1) cluster merging with equal amounts of noise improves the utility somewhat; 2) while adding adaptive noise only does not improve the utility, combining both cluster merging and adaptive noise further improves the utility significantly. (C) 2020 Elsevier B.V. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available