Article

Fair Max-Min Diversity Maximization in Streaming and Sliding-Window Models

Journal

ENTROPY
Volume 25, Issue 7, Pages -

Publisher

MDPI
DOI: 10.3390/e25071066

Keywords

diversity maximization; group fairness; streaming algorithm; sliding-window algorithm

This paper studies diversity maximization with fairness constraints in streaming and sliding-window models. It addresses fair max-min diversity maximization in both models by designing efficient approximation algorithms. Experiments show that the proposed algorithms run several orders of magnitude faster than existing offline algorithms in streaming and sliding-window settings while providing comparable solution quality.
Diversity maximization is a fundamental problem with broad applications in data summarization, web search, and recommender systems. Given a set X of n elements, the problem asks for a subset S of k ≪ n elements with maximum diversity, as quantified by the dissimilarities among the elements in S. In this paper, we study diversity maximization with fairness constraints in streaming and sliding-window models. Specifically, we focus on the max-min diversity maximization problem, which selects a subset S that maximizes the minimum distance (dissimilarity) between any pair of distinct elements within it. Assuming that the set X is partitioned into m disjoint groups by a specific sensitive attribute, e.g., sex or race, ensuring fairness requires that the selected subset S contains k_i elements from each group i ∈ [m]. Although diversity maximization has been extensively studied, existing algorithms for fair max-min diversity maximization are inefficient for data streams. To address the problem, we first design efficient approximation algorithms for this problem in the (insert-only) streaming model, where data arrive one element at a time, and a solution should be computed based on the elements observed in one pass. Furthermore, we propose approximation algorithms for this problem in the sliding-window model, where only the latest w elements in the stream are considered for computation to capture the recency of the data. Experimental results on real-world and synthetic datasets show that our algorithms provide solutions of comparable quality to the state-of-the-art offline algorithms while running several orders of magnitude faster in the streaming and sliding-window settings.
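
To make the problem statement concrete, the following Python sketch spells out the max-min objective, the group-fairness check, and a simple one-pass filter. It is not the paper's algorithm: the abstract does not describe the authors' method, so the threshold-based greedy pass here is a generic illustration. The function names, the distance threshold mu, the Euclidean metric, and the toy data are all assumptions made for this example.

```python
import math
from collections import deque
from itertools import combinations


def diversity(points, dist=math.dist):
    """Max-min objective: the smallest pairwise distance within the subset."""
    if len(points) < 2:
        return math.inf
    return min(dist(p, q) for p, q in combinations(points, 2))


def is_fair(groups, quotas):
    """True if the selection has exactly quotas[i] elements from each group i."""
    counts = {g: 0 for g in quotas}
    for g in groups:
        if g not in counts:
            return False
        counts[g] += 1
    return counts == quotas


def one_pass_candidate(stream, quotas, mu, dist=math.dist):
    """Illustrative single pass (hypothetical helper, not the paper's method).

    Keeps an element if its group still has room and it lies at distance
    >= mu from everything kept so far. The result may fall short of the
    quotas when mu is too large.
    """
    kept = []
    counts = {g: 0 for g in quotas}
    for point, group in stream:
        if counts.get(group, 0) >= quotas.get(group, 0):
            continue  # group quota already filled, or group unknown
        if all(dist(point, q) >= mu for q, _ in kept):
            kept.append((point, group))
            counts[group] += 1
    return kept


def latest_window(stream, w):
    """Naive sliding window: store the latest w elements explicitly."""
    window = deque(maxlen=w)
    for item in stream:
        window.append(item)
    return list(window)


# Toy data (assumed): 2-D points tagged with a group label.
stream = [((0, 0), "a"), ((0, 1), "a"), ((3, 0), "b"),
          ((3, 4), "a"), ((6, 0), "b"), ((0.5, 0), "b")]
quotas = {"a": 2, "b": 1}

selected = one_pass_candidate(stream, quotas, mu=2.5)
print(selected)
print(is_fair([g for _, g in selected], quotas))
print(diversity([p for p, _ in selected]))
```

On this toy stream the pass with mu = 2.5 returns a fair selection whose minimum pairwise distance is 3.0. Running the same pass on latest_window(stream, 3) fills only part of the quotas, which illustrates why a fixed threshold and a naive stored window are not sufficient on their own.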
