Article

Adaptive switching for communication profiles in underwater acoustic modems based on reinforcement learning

Journal

APPLIED ACOUSTICS
Volume 210

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.apacoust.2023.109430

Keywords

Underwater acoustic communications; Adaptive communications; Reinforcement learning; DDPG

This work investigates the adaptation of communication strategies to the dynamics of the underwater acoustic (UWA) channel. Three communication strategies are considered: Frequency-Hopped Binary Frequency Shift Keying (FH-BFSK), Single-Carrier (SC) communication, and multicarrier communication. A reinforcement learning algorithm, the Deep Deterministic Policy Gradient (DDPG) method with a Gumbel-softmax scheme, is employed for intelligent and adaptive switching among these strategies. Simulation and experimental results demonstrate that the proposed method outperforms random selection and direct feedback methods in time-varying channels.
The underwater acoustic (UWA) channel is a complex stochastic process with large spatial and temporal dynamics. This work studies how to adapt the communication strategy to those channel dynamics. Specifically, a set of communication strategies is considered, including Frequency-Hopped Binary Frequency Shift Keying (FH-BFSK), Single-Carrier (SC) communication, and multicarrier communication. Based on the channel condition, a reinforcement learning (RL) algorithm, the Deep Deterministic Policy Gradient (DDPG) method combined with a Gumbel-softmax scheme, is employed for intelligent and adaptive switching among those communication strategies. The switching is performed block by block, one decision per transmission block, with the goal of maximizing long-term system performance. The reward function is defined in terms of the energy efficiency (EE) and the spectral efficiency (SE) of the communication strategies. Simulation results and results from processing experimental data show that the proposed method outperforms a random selection method and a direct feedback method in time-varying channels. (c) 2023 Elsevier Ltd. All rights reserved.
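The abstract does not say how the Gumbel-softmax scheme is connected to DDPG, so the following is only a minimal sketch of the general technique, not the authors' implementation. DDPG produces continuous actions, and a Gumbel-softmax relaxation is a common way to turn continuous actor outputs into a choice among discrete options. The names `STRATEGIES`, `gumbel_softmax`, and `select_strategy`, the temperature value, and the example logits are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical labels for the three strategies named in the abstract.
STRATEGIES = ("FH-BFSK", "SC", "multicarrier")


def gumbel_softmax(logits, temperature, rng):
    """Return a relaxed one-hot sample (a probability vector) over the options."""
    # Draw Gumbel(0, 1) noise via the inverse-CDF trick; the lower bound
    # keeps the inner logarithm finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    # Perturb the logits, scale by temperature, and apply a stable softmax.
    y = (logits + gumbel_noise) / temperature
    y = y - y.max()
    e = np.exp(y)
    return e / e.sum()


def select_strategy(logits, temperature=0.5, seed=None):
    """Pick one strategy for the next transmission block (illustrative only)."""
    rng = np.random.default_rng(seed)
    probs = gumbel_softmax(np.asarray(logits, dtype=float), temperature, rng)
    # The relaxed vector would be passed to a critic during training;
    # the argmax gives the discrete profile actually transmitted.
    return STRATEGIES[int(np.argmax(probs))], probs


if __name__ == "__main__":
    # Assumed actor output (logits) for one channel state.
    name, probs = select_strategy([2.0, 0.5, -1.0], temperature=0.5, seed=0)
    print(name, probs)
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures give smoother vectors that are easier to differentiate through. How the paper sets or schedules this parameter is not stated in this listing.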
