4.7 Article

Deep Reinforcement Learning for Simultaneous Sensing and Channel Access in Cognitive Networks

Journal

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS
Volume 22, Issue 7, Pages 4930-4946

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TWC.2022.3230872

Keywords

Cognitive radio networks; deep reinforcement learning; dynamic spectrum access; wireless channels

This paper investigates dynamic spectrum access (DSA) in cognitive wireless networks, where secondary users (SUs) obtain only partial observations because sensing and transmission are narrowband. To maximize a single SU's long-term throughput, the authors develop Double Deep Q-network for Sensing and Access (DDQSA), an algorithm that learns both access and sensing policies via deep Q-learning. DDQSA achieves near-optimal performance in networks with fixed channel-hopping dynamics, significantly outperforms existing approaches, and attains optimal performance in certain scenarios.
We consider the problem of dynamic spectrum access (DSA) in cognitive wireless networks consisting of primary users (PUs) and secondary users (SUs), where only partial observations are available at the SUs due to narrowband sensing and transmissions. The network operates in a time-slotted regime, and the traffic patterns of the PUs are modeled as finite-memory Markov chains that are unknown to the SUs. Because observations are partial, both the channel sensing and the channel access actions affect the throughput. Focusing on the case of a single SU, our objective is to maximize the SU's long-term throughput. To that end, we develop a novel algorithm, dubbed Double Deep Q-network for Sensing and Access (DDQSA), that learns both access and sensing policies via deep Q-learning. To the best of our knowledge, this is the first work that jointly optimizes sensing and access policies for DSA via deep Q-learning. Next, we consider wireless networks in which the PU access policy implements fixed channel-hopping dynamics, for which we analytically determine the optimal SU sensing and access policy and its associated throughput. We then demonstrate that the proposed DDQSA algorithm achieves near-optimal performance for the considered network. Our results show that DDQSA learns a policy that implements both sensing and channel access, significantly outperforms existing approaches, and can achieve the optimal performance in certain scenarios.
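The setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy environment `MarkovChannelEnv`, the joint-action encoding in `decode_action`, the stay probability `p_stay`, and the discount `gamma` are all hypothetical choices made here for illustration. In particular, each PU is assumed to follow an independent first-order two-state Markov chain, which is only one simple instance of the finite-memory chains mentioned above. The function `double_q_target` shows the standard double Q-learning target, in which the online network selects the next action and the target network evaluates it.

```python
import random


class MarkovChannelEnv:
    """Toy time-slotted DSA environment (illustrative assumption, not the paper's model)."""

    def __init__(self, n_channels=2, p_stay=0.9, seed=0):
        self.n = n_channels
        self.p_stay = p_stay  # assumed probability that a PU keeps its state
        self.rng = random.Random(seed)
        # True means the channel is occupied by a PU in the current slot.
        self.busy = [self.rng.random() < 0.5 for _ in range(self.n)]

    def step(self, access, sense):
        """Advance one slot; the SU transmits on `access` and senses `sense`."""
        # Each PU evolves as an independent two-state Markov chain.
        self.busy = [
            b if self.rng.random() < self.p_stay else not b
            for b in self.busy
        ]
        # Unit reward if the accessed channel is idle, zero on a collision.
        reward = 0.0 if self.busy[access] else 1.0
        # Partial observation: only the sensed channel's state is revealed.
        observation = (sense, int(self.busy[sense]))
        return observation, reward


def decode_action(action, n_channels):
    """Map a flat joint-action index to an (access, sense) channel pair."""
    return divmod(action, n_channels)


def double_q_target(reward, next_q_online, next_q_target, gamma=0.9):
    """Double Q-learning target: online net picks the action, target net scores it."""
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]


if __name__ == "__main__":
    env = MarkovChannelEnv(n_channels=3, seed=1)
    access, sense = decode_action(5, 3)  # joint action 5 -> access 1, sense 2
    obs, reward = env.step(access, sense)
    print(obs, reward)
```

Encoding the pair (access channel, sensing channel) as a single flat index lets one Q-network output cover both decisions, which is one straightforward way to learn the two policies jointly; the paper's actual network architecture and action space may differ.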
