4.4 Article

ON CORESETS FOR k-MEDIAN AND k-MEANS CLUSTERING IN METRIC AND EUCLIDEAN SPACES AND THEIR APPLICATIONS

Journal

SIAM JOURNAL ON COMPUTING
Volume 39, Issue 3, Pages 923-947

Publisher

SIAM PUBLICATIONS
DOI: 10.1137/070699007

Keywords

k-median clustering; k-means clustering; coreset; random sampling; high dimensions; approximation algorithms

Ask authors/readers for more resources

We present new approximation algorithms for the k-median and k-means clustering problems. To this end, we obtain small coresets for k-median and k-means clustering in general metric spaces and in Euclidean spaces. In R-d, these coresets are of size with polynomial dependency on the dimension d. This leads to (1 + epsilon)-approximation algorithms to the optimal k-median and k-means clustering in Rd, with running time O(ndk + 2((k/epsilon)O(1)) d(2) log(k+2) n), where n is the number of points. This improves over previous results. We use those coresets to maintain a (1 + epsilon)-approximate k-median and k-means clustering of a stream of points in R-d, using O(d(2)k(2)epsilon(-2) log(8) n) space. These are the first streaming algorithms, for those problems, that have space complexity with polynomial dependency on the dimension.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.4
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available