Article

A novel asynchronous deep reinforcement learning model with adaptive early forecasting method and reward incentive mechanism for short-term load forecasting

Journal

ENERGY
Volume 236

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.energy.2021.121492

Keywords

Load forecasting; Deep reinforcement learning; Deep learning; Deep deterministic policy gradient

Funding

  1. National Natural Science Foundation of China [51875503, 51975512]
  2. Zhejiang Natural Science Foundation of China [LZ20E050001]
  3. Zhejiang Key R & D Project of China [2021C03153]


This study proposes a novel asynchronous deep reinforcement learning model for short-term load forecasting that addresses the challenges of high temporal correlation and high convergence instability. By introducing new methods to disrupt temporal correlation, adaptively judge the training situation, and stabilize training convergence by accounting for action trends, the proposed model achieves higher forecasting accuracy, lower time cost, and more stable convergence than eleven baseline models.
Accurate load forecasting is challenging due to the significant uncertainty of load demand. Deep reinforcement learning, which integrates the nonlinear fitting ability of deep learning with the decision-making ability of reinforcement learning, has provided effective solutions to various optimization problems. However, no study has been reported that applies deep reinforcement learning to short-term load forecasting, because of the difficulties in handling the high temporal correlation and high convergence instability. In this study, a novel asynchronous deep reinforcement learning model is proposed for short-term load forecasting that addresses these difficulties. First, a new asynchronous deep deterministic policy gradient method is proposed to disrupt the temporal correlation of different samples, reducing the overestimation of the agent's expected total discounted reward. Second, a new adaptive early forecasting method is proposed to reduce the time cost of model training by adaptively judging the training situation of the agent. Third, a new reward incentive mechanism is proposed to stabilize the convergence of model training by taking into account the trend of agent actions at different time steps. The experimental results show that the proposed model achieves higher forecasting accuracy, lower time cost, and more stable convergence than eleven baseline models. (c) 2021 Published by Elsevier Ltd.
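The abstract's first contribution, disrupting the temporal correlation of different samples, is the role a replay buffer with uniform random minibatch sampling plays in standard deep deterministic policy gradient training. The sketch below illustrates only that generic mechanism under assumed names and sizes (`ReplayBuffer`, `capacity`, `batch_size`); it is not the authors' asynchronous implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores load-forecasting transitions and samples them at random,
    breaking the temporal correlation of consecutive time steps."""

    def __init__(self, capacity=10000, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted first
        self.rng = random.Random(seed)

    def push(self, transition):
        # transition = (state, action, reward, next_state)
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement: the minibatch mixes
        # transitions from distant time steps instead of a contiguous run.
        return self.rng.sample(self.buffer, batch_size)

buf = ReplayBuffer()
# Store 100 consecutive hourly transitions, indexed by time step t.
for t in range(100):
    buf.push((t, 0.0, 0.0, t + 1))

batch = buf.sample(8)
times = [tr[0] for tr in batch]
print(times)  # time indices arrive shuffled rather than consecutive
```

Training the critic on such decorrelated minibatches is what standard DDPG uses to keep value estimates from compounding along a single trajectory; the paper's asynchronous variant builds on this idea with multiple parallel agents.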
