Article

Personalized optimization with user's feedback

Journal

AUTOMATICA
Volume 131

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.automatica.2021.109767

Keywords

Online optimization; Upper-confidence bounds; Gaussian processes; Machine learning; Cyber-physical systems

Funding

  1. Isaac Newton Institute for Mathematical Sciences, Cambridge
  2. National Science Foundation CAREER award [1941896]
  3. National Renewable Energy Laboratory (NREL) [DE-AC36-08GO28308]
  4. Laboratory Directed Research and Development (LDRD) Program at NREL
  5. Division of Electrical, Communications and Cyber Systems, Directorate for Engineering, National Science Foundation [1941896]

This paper presents an online algorithm for a time-varying optimization problem whose objective combines a known time-varying cost with an unknown function. It uses Gaussian processes to learn the unknown function and time-varying optimization tools to track the optimal trajectory while learning the user's satisfaction, with inexact steps that accommodate limited computational budgets or real-time implementation.
This paper develops an online algorithm to solve a time-varying optimization problem whose objective comprises a known time-varying cost and an unknown function. This problem structure arises in a number of engineering and cyber-physical systems where the known function captures time-varying engineering costs and the unknown function models the user's satisfaction; in this context, the goal is to strike a balance between given performance metrics and the user's satisfaction. The key challenges are (1) the time variability of the problem, and (2) the fact that the user's utility function is learned concurrently with the execution of the online algorithm. This paper leverages Gaussian processes (GP) to learn the unknown cost function from noisy functional evaluations and to build pertinent upper confidence bounds. Using the GP formalism, the paper then applies time-varying optimization tools to design an online algorithm that tracks the oracle-based optimal trajectory within an error ball, while learning the user's satisfaction function with no regret. The algorithmic steps are inexact, to account for limited computational budgets or real-time implementation constraints. Numerical examples are presented for a problem related to vehicle control. (C) 2021 Elsevier Ltd. All rights reserved.
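To make the ingredients of the abstract concrete, the following is a minimal sketch of one way a GP posterior, an upper confidence bound, and a single inexact online step could fit together. It is not the authors' algorithm. Everything specific here is an assumption made for illustration: the scalar decision variable on [0, 1], the squared-exponential kernel, the drifting quadratic engineering cost, the hidden `satisfaction` function, the fixed `beta`, and the finite-difference gradient.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D point sets a and b (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise_std=0.1):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_obs, x_obs) + noise_std ** 2 * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb(x_obs, y_obs, x_query, beta=4.0):
    """Upper confidence bound: posterior mean plus sqrt(beta) standard deviations."""
    mean, std = gp_posterior(x_obs, y_obs, x_query)
    return mean + np.sqrt(beta) * std

def online_step(x, t, x_obs, y_obs, step=0.1, eps=1e-3):
    """One projected gradient step on (known cost at time t) minus (UCB of satisfaction)."""
    def surrogate(z):
        z = np.atleast_1d(z)
        target = 0.5 + 0.3 * np.sin(0.1 * t)  # hypothetical drifting engineering target
        return 0.5 * (z - target) ** 2 - ucb(x_obs, y_obs, z)
    # Central finite difference stands in for an exact gradient (an inexact step).
    grad = (surrogate(x + eps) - surrogate(x - eps)) / (2 * eps)
    return float(np.clip(x - step * grad[0], 0.0, 1.0))

def run(num_steps=30, seed=0):
    """Interleave one optimization step with one noisy user-feedback sample per time step."""
    rng = np.random.default_rng(seed)
    satisfaction = lambda z: np.exp(-((z - 0.7) ** 2) / 0.05)  # hidden from the algorithm
    x = 0.2
    x_obs = np.array([x])
    y_obs = np.array([satisfaction(x) + 0.1 * rng.standard_normal()])
    trajectory = [x]
    for t in range(1, num_steps + 1):
        x = online_step(x, t, x_obs, y_obs)
        y = satisfaction(x) + 0.1 * rng.standard_normal()
        x_obs, y_obs = np.append(x_obs, x), np.append(y_obs, y)
        trajectory.append(x)
    return np.array(trajectory)
```

Calling `run()` returns the sequence of decisions. The sketch takes only one gradient step per time instant rather than solving each problem to optimality, which is one simple reading of the "inexact" steps mentioned above; the feedback sample is collected at the current decision, so learning and optimization proceed concurrently.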

