Article

Reinforcement Learning for Joint Optimization of Multiple Rewards

Journal

JOURNAL OF MACHINE LEARNING RESEARCH
Volume 24, Issue -, Pages -

Publisher

MICROTOME PUBL

Keywords

-

The paper studies the problem of optimizing a non-linear function of the long-term average rewards and proposes model-based and model-free algorithms to solve it. The proposed algorithms are shown to outperform conventional RL approaches on fairness-oriented cellular base-station scheduling and queueing-system scheduling.
Finding optimal policies that maximize the long-term rewards of a Markov Decision Process relies on dynamic programming and backward induction to solve the Bellman optimality equation. However, many real-world problems require optimizing an objective that is non-linear in the cumulative rewards, to which dynamic programming cannot be applied directly. For example, in a resource allocation problem, one of the objectives is to maximize long-term fairness among the users. We note that when the agent aims to optimize some function of the sum of rewards, the problem loses its Markov nature. This paper formalizes and addresses the problem of optimizing a non-linear function of the long-term average of rewards. We propose model-based and model-free algorithms to learn the policy, where the model-based policy is shown to achieve a regret of Õ(LKDS√(A/T)) for K objectives combined with a concave L-Lipschitz function. Further, using fairness in cellular base-station scheduling and queueing-system scheduling as examples, the proposed algorithms are shown to significantly outperform conventional RL approaches.
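To make the objective concrete: the abstract describes maximizing a concave L-Lipschitz function f of the vector of K long-term average rewards, rather than the usual sum of scalar rewards. The sketch below is not the paper's algorithm; the two-user scheduling setup, the channel rates, and the choice of f as proportional fairness (sum of log average throughputs) are all illustrative assumptions. It only shows why such an objective is not captured by standard RL: the policy that maximizes the sum of average rewards can score poorly on the concave fairness objective.

```python
import numpy as np

def proportional_fairness(avg_rewards, eps=1e-8):
    """Concave scalarization f of the K-dimensional average-reward vector."""
    return float(np.sum(np.log(np.asarray(avg_rewards) + eps)))

def evaluate_policy(policy, rates, horizon=10_000, seed=0):
    """Simulate a fixed scheduling policy and return per-user average throughput.

    rates[k] is the success probability of user k when it is scheduled;
    policy(t, totals) returns the index of the user to schedule at step t.
    """
    rng = np.random.default_rng(seed)
    totals = np.zeros(len(rates))
    for t in range(horizon):
        k = policy(t, totals)
        totals[k] += rng.random() < rates[k]  # Bernoulli throughput sample
    return totals / horizon

rates = [0.9, 0.3]  # user 0 has a much better channel than user 1

# Greedy policy: always schedule the best channel (maximizes the sum of rewards).
greedy = lambda t, totals: int(np.argmax(rates))
# Round-robin policy: alternate users (a simple fairness-oriented baseline).
round_robin = lambda t, totals: t % len(rates)

for name, pol in [("greedy", greedy), ("round-robin", round_robin)]:
    avg = evaluate_policy(pol, rates)
    print(f"{name:12s} avg rewards={np.round(avg, 3)}  "
          f"sum={avg.sum():.3f}  fairness f={proportional_fairness(avg):.3f}")
```

In this toy run the greedy schedule wins on total throughput but starves the weaker user, while round-robin scores far higher on the concave log-utility; trading off a vector of average rewards through such a function is the setting the proposed algorithms target with regret guarantees.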

Reviews

Primary Rating: 4.7 (not enough ratings)
Secondary Ratings: Novelty -, Significance -, Scientific rigor -