Article

Reinforcement learning of occupant behavior model for cross-building transfer learning to various HVAC control systems

Journal

ENERGY AND BUILDINGS
Volume 238

Publisher

ELSEVIER SCIENCE SA
DOI: 10.1016/j.enbuild.2021.110860

Keywords

Thermal comfort; Machine learning; Artificial neural network; Air temperature; Thermostat set point; Q-learning; Building performance simulation


This study developed a policy-based reinforcement learning model to predict how occupants adjust the thermostat set point and their clothing level, and used transfer learning to obtain accurate predictions in different types of buildings. The model was trained with Q-learning in MATLAB, and its behavior knowledge was transferred successfully across different HVAC control systems.
Occupant behavior plays an important role in the evaluation of building performance. However, many contextual factors, such as occupancy, mechanical system and interior design, have a significant impact on occupant behavior. Most previous studies have built data-driven behavior models, which have limited scalability and generalization capability. Our investigation built a policy-based reinforcement learning (RL) model for the behavior of adjusting the thermostat and clothing level. Occupant behavior was modelled as a Markov decision process (MDP). The action and state spaces of the MDP contained the occupant behavior and the various parameters that influence it. The goal of the occupant behavior was a more comfortable environment, and we modelled the reward for an adjustment action as the absolute difference in the thermal sensation vote (TSV) before and after the action. We used Q-learning to train the RL model in MATLAB and validated the model with collected data. After training, the model predicted thermostat set point adjustments in an office building with R² between 0.75 and 0.8 and a mean absolute error (MAE) of less than 1.1 °C (2 °F). This study also transferred the behavior knowledge of the RL model to other office buildings with different HVAC control systems. The transfer learning model predicted the occupant behavior with R² between 0.73 and 0.8, and the MAE was less than 1.1 °C (2 °F) most of the time. When transferred from office buildings to residential buildings, the model still achieved an R² above 0.6. Therefore, the RL model combined with transfer learning was able to predict building occupant behavior accurately and with good scalability, without the need for data collection. © 2021 Elsevier B.V. All rights reserved.
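
The MDP formulation and Q-learning training described in the abstract can be illustrated with a minimal tabular sketch. It is not the authors' MATLAB implementation. Everything specific here is an assumption made for illustration: the state is only a thermostat set point on a hypothetical 18 to 28 °C grid, the actions lower, keep, or raise it by 1 °C, the `tsv` function is a made-up linear comfort model, and the reward is read as the reduction in |TSV| caused by the action, which is one possible interpretation of the abstract's wording. The paper's actual state space also includes clothing level and other parameters.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
SETPOINTS = list(range(18, 29))   # hypothetical thermostat set points in °C
ACTIONS = (-1, 0, 1)              # lower, keep, or raise the set point by 1 °C
NEUTRAL = 23.0                    # assumed neutral (most comfortable) temperature


def tsv(temp_c):
    """Hypothetical linear thermal sensation vote, clipped to [-3, 3]."""
    return max(-3.0, min(3.0, (temp_c - NEUTRAL) / 2.0))


def step(state, action):
    """Apply an action to a state index; return (next_state, reward).

    The reward is the drop in |TSV| produced by the action, an assumed
    reading of "difference in TSV before and after the action".
    """
    next_state = min(len(SETPOINTS) - 1, max(0, state + action))
    reward = abs(tsv(SETPOINTS[state])) - abs(tsv(SETPOINTS[next_state]))
    return next_state, reward


def train(episodes=2000, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in SETPOINTS]
    for _ in range(episodes):
        s = rng.randrange(len(SETPOINTS))          # random starting set point
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))    # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])  # exploit
            s2, r = step(s, ACTIONS[a])
            # Standard Q-learning update toward r + gamma * max_a' Q(s', a').
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q


def greedy_action(q, state):
    """Return the set point change with the highest learned Q-value."""
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[state][i])]
```

Calling `q = train()` and then `greedy_action(q, state)` for each state index gives the learned adjustment policy of this toy occupant: raise the set point when cold, lower it when warm, and leave it alone at the assumed neutral temperature. A tabular Q-function is used only to keep the sketch short; the keywords suggest the paper also involves an artificial neural network.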
