Journal
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Volume 70, Issue 2, Pages 1851-1865
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TVT.2021.3055065
Keywords
Reinforcement learning; Delays; Wireless sensor networks; Protocols; Internet of Things; Synchronization; Energy consumption; The internet of underwater things (IoUT); deep reinforcement learning; asynchronous wake-up scheme; cyclic difference set (CDS)
Funding
- Natural Science Foundation of Jiangsu Province [BK20190733]
- NUPTSF [NY219166]
- National Natural Science Foundation of China [61872423]
- Natural Sciences and Engineering Research Council (NSERC) of Canada [RGPIN-2018-03792]
- InnovateNL SensorTECH [5404-2061-101]
The paper explores the optimal policy selection for sensor nodes in underwater acoustic sensor networks, proposing an adaptive asynchronous wake-up scheme based on deep reinforcement learning and LSTM networks to enhance energy efficiency and network performance.
Underwater acoustic sensor networks (UWSNs), which serve as a reliable and efficient infrastructure for the Internet of underwater things (IoUT), have attracted considerable research interest in recent years because of their wide range of potential marine applications. The limited energy supply of underwater sensor nodes is a significant challenge, which the cyclic difference set (CDS)-based asynchronous wake-up scheme can mitigate. However, the CDS-based asynchronous wake-up scheme also introduces long neighbor-discovery delays, which degrade both packet delay and network lifetime. In this paper, we formulate the problem of policy selection for idle listening as a Markov decision process and use the framework of deep reinforcement learning to obtain the optimal policies of underwater sensor nodes. Furthermore, long short-term memory (LSTM) networks are used to estimate the network traffic features, which improves the performance of the proposed adaptive asynchronous wake-up scheme. To verify the performance of the proposed scheme, we conduct simulations in different network scenarios and compare it with random policies, fixed-metric policies, and the original CDS-based asynchronous wake-up scheme.
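The CDS property that the wake-up scheme relies on can be illustrated with a small sketch. The code below is not taken from the paper: it assumes the textbook (7, 3, 1) cyclic difference set {1, 2, 4} as an example, and the function names `is_cyclic_difference_set` and `overlap_slots` are hypothetical. It checks that two nodes following the same wake-up pattern with arbitrary, unsynchronized slot offsets still share at least one awake slot per cycle, while each node is awake in only 3 of 7 slots.

```python
def is_cyclic_difference_set(d_set, v, lam=1):
    """Return True if every nonzero residue mod v occurs exactly `lam`
    times as a difference (a - b) mod v of distinct elements of d_set."""
    counts = [0] * v
    for a in d_set:
        for b in d_set:
            if a != b:
                counts[(a - b) % v] += 1
    return all(c == lam for c in counts[1:])


def overlap_slots(d_set, v, shift_a, shift_b):
    """Slots in which two nodes are awake together when each follows the
    same pattern d_set, cyclically shifted by its own clock offset."""
    awake_a = {(d + shift_a) % v for d in d_set}
    awake_b = {(d + shift_b) % v for d in d_set}
    return awake_a & awake_b


# Illustrative parameters (an assumption, not the paper's configuration).
V = 7            # slots per cycle
D = {1, 2, 4}    # awake slots within a cycle
DUTY_CYCLE = len(D) / V

# Smallest number of shared awake slots over all pairs of clock offsets.
MIN_OVERLAP = min(
    len(overlap_slots(D, V, a, b)) for a in range(V) for b in range(V)
)

if __name__ == "__main__":
    print("valid CDS:", is_cyclic_difference_set(D, V))
    print("duty cycle:", round(DUTY_CYCLE, 3))
    print("minimum shared awake slots:", MIN_OVERLAP)
```

The guaranteed overlap is what lets neighbors discover each other without clock synchronization. With only one shared slot per cycle under most offsets, discovery can be slow, which is consistent with the delay problem the abstract describes.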