Proceedings Paper

Diversity-Aware k-median: Clustering with Fair Center Representation

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-030-86520-7_47

Keywords

Algorithmic bias; Algorithmic fairness; Diversity-aware clustering; Fair clustering

Funding

  1. Academy of Finland (AKA) [317085, 325117]
  2. European Research Council (ERC) [834862]
  3. EC H2020 RIA project SoBigData [871042]
  4. Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

The study introduces a novel problem of diversity-aware clustering, where potential cluster centers belong to groups defined by protected attributes. It shows that the diversity-aware k-median problem is NP-hard in general cases but approximation algorithms can be obtained when facility groups are disjoint. Experimentally, approximation methods are evaluated for tractable cases, and a relaxation-based heuristic is provided for theoretically intractable scenarios.
We introduce a novel problem for diversity-aware clustering. We assume that the potential cluster centers belong to a set of groups defined by protected attributes, such as ethnicity, gender, etc. We then ask to find a minimum-cost clustering of the data into $k$ clusters so that a specified minimum number of cluster centers are chosen from each group. We thus require that all groups are represented in the clustering solution as cluster centers, according to specified requirements. More precisely, we are given a set of clients $C$, a set of facilities $F$, a collection $\mathcal{F} = \{F_1, \ldots, F_t\}$ of facility groups $F_i \subseteq F$, a budget $k$, and a set of lower-bound thresholds $R = \{r_1, \ldots, r_t\}$, one for each group in $\mathcal{F}$. The diversity-aware k-median problem asks to find a set $S$ of $k$ facilities in $F$ such that $|S \cap F_i| \geq r_i$, that is, at least $r_i$ centers in $S$ are from group $F_i$, and the k-median cost $\sum_{c \in C} \min_{s \in S} d(c, s)$ is minimized. We show that in the general case where the facility groups may overlap, the diversity-aware k-median problem is NP-hard, fixed-parameter intractable with respect to parameter $k$, and inapproximable to any multiplicative factor. On the other hand, when the facility groups are disjoint, approximation algorithms can be obtained by reduction to the matroid median and red-blue median problems. Experimentally, we evaluate our approximation methods for the tractable cases, and present a relaxation-based heuristic for the theoretically intractable case, which can provide high-quality and efficient solutions for real-world datasets.
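
To make the problem definition concrete, the following is a minimal brute-force sketch in Python. It is not the paper's approximation algorithm or its relaxation-based heuristic; it simply enumerates every set of k facilities, discards those that violate a group threshold, and keeps the cheapest remaining set. The function names (`kmedian_cost`, `diversity_aware_kmedian`) and the toy instance on the real line are assumptions made for illustration, and the exhaustive search is only practical for very small inputs.

```python
from itertools import combinations


def kmedian_cost(clients, centers, dist):
    """k-median cost: each client pays the distance to its nearest center."""
    return sum(min(dist(c, s) for s in centers) for c in clients)


def diversity_aware_kmedian(clients, facilities, groups, thresholds, k, dist):
    """Exhaustive search over all k-subsets of facilities (illustrative only).

    groups[i] is a set of facilities and thresholds[i] is the minimum number
    of chosen centers that must come from groups[i].
    Returns (best_cost, best_set), or (None, None) if no subset is feasible.
    """
    best_cost, best_set = None, None
    for subset in combinations(facilities, k):
        chosen = set(subset)
        # Skip subsets that violate any lower-bound requirement.
        if any(len(chosen & g) < r for g, r in zip(groups, thresholds)):
            continue
        cost = kmedian_cost(clients, subset, dist)
        if best_cost is None or cost < best_cost:
            best_cost, best_set = cost, chosen
    return best_cost, best_set


# Hypothetical toy instance: points on the real line, absolute difference as d.
clients = [0, 1, 2, 10, 11, 12]
facilities = [1, 11, 20]
groups = [{1, 11}, {20}]   # two disjoint facility groups
thresholds = [1, 1]        # at least one center from each group


def dist(a, b):
    return abs(a - b)


cost, centers = diversity_aware_kmedian(
    clients, facilities, groups, thresholds, 2, dist
)
print(cost, centers)  # 28 {1, 20}

# With no representation requirements the cheaper set {1, 11} is allowed.
free_cost, free_centers = diversity_aware_kmedian(
    clients, facilities, groups, [0, 0], 2, dist
)
print(free_cost, free_centers)  # 4 {1, 11}
```

In this toy instance the representation requirement raises the optimal cost from 4 to 28, because the only facility in the second group lies far from every client. The gap illustrates the trade-off between clustering cost and center diversity that the problem formalizes.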
