Article

Outlier-eliminated k-means clustering algorithm based on differential privacy preservation

Journal

APPLIED INTELLIGENCE
Volume 45, Issue 4, Pages 1179-1191

Publisher

SPRINGER
DOI: 10.1007/s10489-016-0813-z

Keywords

Differential privacy (DP) preservation; k-means clustering; Outlier; OEDP

Funding

  1. National Natural Science Foundation of China [61370050]
  2. Natural Science Foundation of Anhui Province [1508085QF134]


Mining data for valuable information can compromise individual privacy, and the need to preserve privacy in turn limits what data mining can achieve. A k-means clustering algorithm based on differential privacy must preserve privacy while maintaining the availability of the clustering results, and traditional algorithms struggle to balance the two. This paper proposes an outlier-eliminated differential privacy (OEDP) k-means algorithm that both preserves privacy and improves clustering efficiency. The proposed approach selects the initial centre points according to the distribution density of the data points and adds Laplacian noise to the original data for privacy preservation. Both a theoretical analysis and comparative experiments were conducted. The theoretical analysis shows that the proposed algorithm satisfies ε-differential privacy. The experimental results show that, compared with other methods, the proposed algorithm effectively preserves data privacy and improves the clustering results in terms of accuracy, stability, and availability.
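The abstract names three ingredients (density-based selection of initial centres, outlier elimination, and Laplacian noise) but does not give their details. The sketch below is therefore an illustration under stated assumptions, not the authors' OEDP algorithm. It assumes data scaled to [0, 1] in each dimension, estimates density as the number of points within a hypothetical `radius`, drops points whose density is below a hypothetical `min_density`, and swaps in the common differentially private k-means update that perturbs per-cluster sums and counts instead of perturbing the raw data. The privacy accounting covers only that noisy update step; the density-based initialisation and outlier filter are not accounted for here. All function and parameter names are made up for this example.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def density_init(points, k, radius):
    # Density of a point = number of points (itself included) within `radius`.
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centres = []
    for i in order:
        # Take the densest points that are not too close to a chosen centre.
        if all(dist(points[i], c) > radius for c in centres):
            centres.append(list(points[i]))
        if len(centres) == k:
            return centres, density
    raise ValueError("could not find k well-separated dense points")


def dp_kmeans(points, k, epsilon, iters=5, radius=0.2, min_density=3, seed=0):
    rng = random.Random(seed)
    dim = len(points[0])
    centres, density = density_init(points, k, radius)
    # Assumed outlier rule: discard points in sparse regions.
    kept = [p for p, d in zip(points, density) if d >= min_density]
    # With data in [0, 1]^dim, one point changes a cluster sum by at most
    # `dim` (L1) and a count by 1; the budget is split evenly over iterations.
    scale = (dim + 1) * iters / epsilon
    for _ in range(iters):
        sums = [[0.0] * dim for _ in centres]
        counts = [0] * k
        for p in kept:
            j = min(range(k), key=lambda c: dist(p, centres[c]))
            counts[j] += 1
            for a in range(dim):
                sums[j][a] += p[a]
        new_centres = []
        for j in range(k):
            noisy_count = counts[j] + laplace_noise(scale, rng)
            if noisy_count < 1.0:
                # Too few points to form a stable ratio; keep the old centre.
                new_centres.append(centres[j])
                continue
            centre = [
                min(1.0, max(0.0, (sums[j][a] + laplace_noise(scale, rng)) / noisy_count))
                for a in range(dim)
            ]
            new_centres.append(centre)
        centres = new_centres
    return centres
```

A call such as `dp_kmeans(points, k=2, epsilon=1.0)` returns `k` centres clamped to the unit cube. Smaller values of `epsilon` give a larger noise scale and hence noisier centres, which is the privacy versus availability trade-off the abstract describes.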
