☆ 3.8 Proceedings Paper

Scaling Multi-Armed Bandit Algorithms

KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING (2019)

Journal

KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING

Volume -, Issue -, Pages 1449-1459

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3292500.3330862

Keywords

Bandit Algorithms; Thompson Sampling; Adaptive Windowing; Data Stream Monitoring; Predictive Maintenance

Funding

DFG [2153]
German Federal Ministry of Education and Research (BMBF) [01IS17042]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

The Multi-Armed Bandit (MAB) is a fundamental model capturing the dilemma between exploration and exploitation in sequential decision making. At every time step, the decision maker selects a set of arms and observes a reward from each of the chosen arms. In this paper, we present a variant of the problem, which we call the Scaling MAB (S-MAB): The goal of the decision maker is not only to maximize the cumulative rewards, i.e., choosing the arms with the highest expected reward, but also to decide how many arms to select so that, in expectation, the cost of selecting arms does not exceed the rewards. This problem is relevant to many real-world applications, e.g., online advertising, financial investments or data stream monitoring. We propose an extension of Thompson Sampling, which has strong theoretical guarantees and is reported to perform well in practice. Our extension dynamically controls the number of arms to draw. Furthermore, we combine the proposed method with ADWIN, a state-of-the-art change detector, to deal with non-static environments. We illustrate the benefits of our contribution via a real-world use case on predictive maintenance.

Scaling Multi-Armed Bandit Algorithms

Journal

KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING

Publisher

ASSOC COMPUTING MACHINERY

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Scaling Multi-Armed Bandit Algorithms

Journal

KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING

Publisher

ASSOC COMPUTING MACHINERY

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper