Journal
ENERGY
Volume 236, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.energy.2021.121492
Keywords
Load forecasting; Deep reinforcement learning; Deep learning; Deep deterministic policy gradient
Funding
- National Natural Science Foundation of China [51875503, 51975512]
- Zhejiang Natural Science Foundation of China [LZ20E050001]
- Zhejiang Key R & D Project of China [2021C03153]
This study proposes a novel asynchronous deep reinforcement learning model for short-term load forecasting that addresses two key challenges: high temporal correlation among samples and high convergence instability. By disrupting the temporal correlation of samples, adaptively judging the agent's training situation, and stabilizing convergence through a reward mechanism that accounts for action trends, the proposed model achieves higher forecasting accuracy, lower time cost, and more stable convergence than eleven baseline models.
Accurate load forecasting is challenging due to the significant uncertainty of load demand. Deep reinforcement learning, which integrates the nonlinear fitting ability of deep learning with the decision-making ability of reinforcement learning, has yielded effective solutions to various optimization problems. However, no study has yet applied deep reinforcement learning to short-term load forecasting, owing to the difficulties of handling high temporal correlation and high convergence instability. In this study, a novel asynchronous deep reinforcement learning model is proposed for short-term load forecasting that addresses these difficulties. First, a new asynchronous deep deterministic policy gradient method is proposed to disrupt the temporal correlation of different samples, thereby reducing the overestimation of the agent's expected total discounted reward. Second, a new adaptive early forecasting method is proposed to reduce the time cost of model training by adaptively judging the agent's training situation. Third, a new reward incentive mechanism is proposed to stabilize the convergence of model training by taking into account the trend of agent actions at different time steps. The experimental results show that the proposed model achieves higher forecasting accuracy, lower time cost, and more stable convergence compared with eleven baseline models. (c) 2021 Published by Elsevier Ltd.
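Two of the abstract's ingredients can be illustrated in a short sketch. The snippet below is a minimal, hedged illustration only: the buffer uses standard uniform replay sampling to break temporal correlation between consecutive load samples (the paper's asynchronous DDPG variant is more elaborate), and `trend_reward` is a hypothetical stand-in for the reward incentive mechanism, assuming "action trend" means consistency of the agent's recent forecast adjustments. All names and parameters here are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Stores past (state, action, reward, next_state) transitions.

    Sampling minibatches uniformly at random, rather than in time
    order, disrupts the temporal correlation between consecutive
    load observations -- the standard DDPG ingredient the abstract
    builds on.
    """

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement across the whole history.
        return random.sample(self.buffer, batch_size)


def trend_reward(base_reward, recent_actions, bonus=0.1):
    """Hypothetical reward incentive based on action trend.

    Adds a small bonus when the last three actions move in a
    consistent direction, mimicking the idea of rewarding stable
    action trends across time steps to steady convergence.
    """
    if len(recent_actions) >= 3:
        d1 = recent_actions[-1] - recent_actions[-2]
        d2 = recent_actions[-2] - recent_actions[-3]
        if d1 * d2 > 0:  # same sign: consistent trend
            return base_reward + bonus
    return base_reward
```

In a full agent, the actor and critic networks would be updated from `sample()` minibatches, with `trend_reward` shaping the reward signal before it enters the critic's target.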