A reinforcement learning approach for thermostat setpoint preference learning

BUILDING SIMULATION (2023)

Journal

BUILDING SIMULATION

Publisher

TSINGHUA UNIV PRESS

DOI: 10.1007/s12273-023-1056-7

Keywords

reinforcement learning; preference learning; occupant-centric controls; smart thermostats; off-policy learning

Automated Summary

Occupant-centric controls (OCC) is an indoor climate control approach that utilizes occupant feedback to operate building energy systems. This paper introduces a new off-policy reinforcement learning (RL) algorithm that imitates occupant behavior by utilizing unsolicited occupant thermostat overrides. Simulation results show that the RL algorithm can rapidly learn occupant preferences and achieve substantial energy savings, although the impact varies depending on occupants' preferences and thermostat use behavior stochasticity.

Abstract

Occupant-centric controls (OCC) is an indoor climate control approach whereby occupant feedback is used in the sequence of operation of building energy systems. While OCC has been used in a wide range of building applications, an OCC category that has received considerable research interest is learning occupants' thermal preferences through their thermostat interactions and adapting temperature setpoints accordingly. Many recent studies used reinforcement learning (RL) agents in OCC to optimize energy use and occupant comfort. These studies depended on predicted mean vote (PMV) models or constant comfort ranges to represent comfort, while only a few of them used thermostat interactions. This paper addresses this gap by introducing a new off-policy RL algorithm that imitates occupant behaviour by utilizing unsolicited occupant thermostat overrides. The algorithm is tested with a number of synthetically generated occupant behaviour models implemented via the Python API of EnergyPlus. The simulation results indicate that the RL algorithm could rapidly learn preferences for all tested occupant behaviour scenarios with minimal exploration events. While substantial energy savings were observed with most occupant scenarios, the size of the savings varied depending on occupants' preferences and the stochasticity of their thermostat use behaviour.
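
The abstract does not give the algorithm's details, so the following is only a minimal illustrative sketch of the general idea of learning a setpoint preference from thermostat overrides, and not the paper's method. It assumes a single-state, tabular value estimate over a small set of candidate setpoints, a hypothetical deterministic occupant (`simulate_occupant`), and hand-picked rewards of +1 (no override) and -1 (override). The off-policy element here is that an override is also treated as a demonstrated action: the value of the setpoint the occupant chose is updated even though the agent did not select it. Energy use and EnergyPlus are left out entirely.

```python
import random

# Hypothetical candidate heating setpoints in deg C (not taken from the paper).
SETPOINTS = [20.0, 21.0, 22.0, 23.0, 24.0]


def simulate_occupant(setpoint, preferred, tolerance=0.5):
    """Hypothetical occupant model: overrides to `preferred` when the current
    setpoint is more than `tolerance` away, otherwise does nothing (None)."""
    if abs(setpoint - preferred) > tolerance:
        return preferred
    return None


def learn_preference(preferred, steps=200, alpha=0.5, epsilon=0.1, seed=0):
    """Estimate the occupant's preferred setpoint from overrides.

    Assumes `preferred` is one of SETPOINTS. All hyperparameters are
    illustrative assumptions.
    """
    rng = random.Random(seed)
    q = {sp: 0.0 for sp in SETPOINTS}  # value estimate per candidate setpoint

    for _ in range(steps):
        # Epsilon-greedy choice of the setpoint the agent applies.
        if rng.random() < epsilon:
            action = rng.choice(SETPOINTS)
        else:
            action = max(SETPOINTS, key=lambda sp: q[sp])

        override = simulate_occupant(action, preferred)
        if override is None:
            # No complaint: move the applied setpoint's value toward +1.
            q[action] += alpha * (1.0 - q[action])
        else:
            # Override: move the applied setpoint's value toward -1 ...
            q[action] += alpha * (-1.0 - q[action])
            # ... and credit the occupant's own choice, which the agent
            # did not take (the off-policy part of this sketch).
            q[override] += alpha * (1.0 - q[override])

    return max(SETPOINTS, key=lambda sp: q[sp])
```

For example, `learn_preference(22.0)` simulates an occupant who prefers 22 °C and returns the setpoint the sketch settles on. With a stochastic occupant model, overrides would be noisier and more interactions would presumably be needed, which is in the spirit of the abstract's remark about behaviour stochasticity.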
