Article

Dynamic Charging Scheme Problem With Actor-Critic Reinforcement Learning

Journal

IEEE INTERNET OF THINGS JOURNAL
Volume 8, Issue 1, Pages 370-380

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JIOT.2020.3005598

Keywords

Actor-critic reinforcement learning (ACRL); charging scheme; mobile charger (MC); wireless rechargeable sensor networks (WRSNs)

Funding

  1. National Natural Science Foundation of China [61572113, 61877009]
  2. Fundamental Research Funds for the Central Universities [ZYGX2019J075]


This article proposes a novel dynamic charging scheme for wireless rechargeable sensor networks (WRSNs) based on an actor-critic reinforcement learning (ACRL) algorithm. Gated recurrent units (GRUs) are introduced to capture the relationships between charging actions over time. Extensive simulations show that the proposed ACRL algorithm surpasses heuristic algorithms in average lifetime and tour length.
The energy problem is one of the most important challenges in the application of sensor networks. With the development of wireless charging technology and intelligent mobile chargers (MCs), the energy problem can be addressed by a wireless charging strategy. In practical applications of wireless rechargeable sensor networks (WRSNs), the energy consumption rate of nodes changes dynamically due to many uncertainties, such as node deaths and the differing transmission tasks of sensor nodes. However, existing works focus on on-demand schemes, which do not fully consider real-time global charging scheduling. In this article, a novel dynamic charging scheme (DCS) for WRSNs based on the actor-critic reinforcement learning (ACRL) algorithm is proposed. In the ACRL, we introduce gated recurrent units (GRUs) to capture the relationships between charging actions over time. Using an actor network with one GRU layer, we can more quickly select an optimal or near-optimal sensor node from the candidates as the next charging target, which also speeds up the training of the model. Meanwhile, we take the tour length and the number of dead nodes as the reward signal. The actor and critic networks are updated by the error criterion function of R and V. In extensive simulations against current on-demand charging scheduling algorithms, the proposed ACRL algorithm surpasses heuristic algorithms such as Greedy, DP, nearest job next with preemption, and TSCA in average lifetime and tour length, especially as the size and complexity of WRSNs increase.
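The update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weighted-sum reward, the `dead_penalty` weight, the squared-error critic loss, and all function names are assumptions made for illustration. The abstract states only that tour length and the number of dead nodes form the reward and that both networks are updated from an error between R and V. The GRU-based actor is replaced here by a fixed list of scores, one per sensor node.

```python
import math

def reward(tour_length, num_dead_nodes, dead_penalty=10.0):
    # Assumed form: shorter tours and fewer dead nodes give a higher reward.
    # The weighting (dead_penalty) is hypothetical, not taken from the paper.
    return -(tour_length + dead_penalty * num_dead_nodes)

def masked_softmax(scores, candidates):
    # Probability distribution restricted to candidate node indices;
    # nodes that are not candidates receive probability 0.
    m = max(scores[i] for i in candidates)
    exps = {i: math.exp(scores[i] - m) for i in candidates}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]

def actor_critic_losses(R, V, log_prob):
    # Error between the observed reward R and the critic's estimate V.
    advantage = R - V
    actor_loss = -advantage * log_prob   # policy gradient with a critic baseline
    critic_loss = advantage ** 2         # squared error (an assumed criterion)
    return actor_loss, critic_loss

# Stand-in for the actor network's per-node scores (hypothetical values).
scores = [0.2, 1.5, -0.3, 0.9]
candidates = [1, 3]                      # nodes currently requesting a charge
probs = masked_softmax(scores, candidates)
target = max(candidates, key=lambda i: probs[i])  # greedy choice of next node

R = reward(tour_length=42.0, num_dead_nodes=1)
actor_loss, critic_loss = actor_critic_losses(
    R, V=-50.0, log_prob=math.log(probs[target])
)
```

In a full model the scores would come from the GRU-based actor conditioned on the sequence of earlier charging actions, and the two losses would be minimized by gradient descent on the actor and critic parameters.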
