Article

A reinforcement learning approach for thermostat setpoint preference learning

Journal

BUILDING SIMULATION

Publisher

TSINGHUA UNIV PRESS
DOI: 10.1007/s12273-023-1056-7

Keywords

reinforcement learning; preference learning; occupant-centric controls; smart thermostats; off-policy learning

Occupant-centric controls (OCC) is an indoor climate control approach that uses occupant feedback to operate building energy systems. This paper introduces a new off-policy reinforcement learning (RL) algorithm that imitates occupant behaviour by using unsolicited occupant thermostat overrides. Simulation results show that the RL algorithm can rapidly learn occupant preferences and achieve substantial energy savings, although the savings vary with occupants' preferences and the stochasticity of their thermostat use.

Occupant-centric controls (OCC) is an indoor climate control approach in which occupant feedback is used in the sequence of operation of building energy systems. While OCC has been used in a wide range of building applications, one category that has received considerable research interest is learning occupants' thermal preferences from their thermostat interactions and adapting temperature setpoints accordingly. Many recent studies have used reinforcement learning (RL) agents for OCC to optimize energy use and occupant comfort. These studies relied on predicted mean vote (PMV) models or constant comfort ranges to represent comfort, and only a few of them used thermostat interactions. This paper addresses this gap by introducing a new off-policy RL algorithm that imitates occupant behaviour by using unsolicited occupant thermostat overrides. The algorithm is tested with a number of synthetically generated occupant behaviour models implemented via the Python API of EnergyPlus. The simulation results indicate that the RL algorithm could rapidly learn preferences in all tested occupant behaviour scenarios with minimal exploration events. While substantial energy savings were observed in most occupant scenarios, the savings varied depending on occupants' preferences and the stochasticity of their thermostat use.
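
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of learning setpoint preferences off-policy from overrides, not the paper's method. It uses plain tabular Q-learning, and everything in it is an assumption made for illustration: the candidate setpoints, the time-of-day state, the +1/-1 reward scheme, and the function names.

```python
# Hypothetical candidate setpoints (deg C) and a coarse time-of-day state.
SETPOINTS = [20.0, 21.0, 22.0, 23.0, 24.0]
N_PERIODS = 4


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step. Bootstrapping from the max over next
    actions, rather than the action actually taken, makes it off-policy."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def nearest_action(setpoint):
    """Index of the candidate setpoint closest to an observed override."""
    return min(range(len(SETPOINTS)), key=lambda i: abs(SETPOINTS[i] - setpoint))


def learn_from_override(q, state, agent_action, override_setpoint, next_state):
    """Assumed reward scheme: an unsolicited override counts against the
    agent's setpoint (-1) and in favour of the occupant's choice (+1)."""
    occupant_action = nearest_action(override_setpoint)
    q_update(q, state, agent_action, -1.0, next_state)
    q_update(q, state, occupant_action, +1.0, next_state)


def greedy_setpoint(q, state):
    """Setpoint with the highest learned value in the given state."""
    row = q[state]
    return SETPOINTS[row.index(max(row))]


q = [[0.0] * len(SETPOINTS) for _ in range(N_PERIODS)]

# Synthetic override log: (period, agent's setpoint index, occupant override).
overrides = [(0, 0, 22.0), (0, 1, 22.0), (2, 4, 21.0)]
for period, agent_action, override in overrides:
    learn_from_override(q, period, agent_action, override, (period + 1) % N_PERIODS)

print(greedy_setpoint(q, 0), greedy_setpoint(q, 2))
```

In this toy log the occupant twice overrides to 22 °C in period 0 and once to 21 °C in period 2, so the greedy setpoints for those periods move to the occupant's choices. A real study would replace the synthetic log with overrides produced by an occupant behaviour model inside a building simulation.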
