Article

Cooperative Multi-Agent Reinforcement-Learning-Based Distributed Dynamic Spectrum Access in Cognitive Radio Networks

Journal

IEEE INTERNET OF THINGS JOURNAL
Volume 9, Issue 19, Pages 19477-19488

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JIOT.2022.3168296

Keywords

Cognitive radio networks; cooperative game; decentralized partially observable Markov decision process (Dec-POMDP); deep recurrent Q-network (DRQN); dynamic spectrum access (DSA); Markov game; multi-agent reinforcement learning (MARL)

Funding

  1. National Natural Science Foundation of China [62171449, 61931020]

This article investigates the distributed dynamic spectrum access problem for multiple users in a cognitive radio network. Using deep recurrent Q-networks and cooperative multi-agent reinforcement learning, it proposes a centralized offline training and distributed online execution framework that maximizes network throughput. Experimental results show that the algorithm outperforms the state-of-the-art baseline in terms of successful access rate and collision rate.
With the development of wireless communication and the Internet of Things (IoT), a massive number of wireless devices must share limited spectrum resources. Dynamic spectrum access (DSA) is a promising paradigm for remedying the inefficient spectrum utilization caused by the historical command-and-control approach to spectrum allocation. In this article, we investigate the distributed DSA problem for multiple users in a typical multichannel cognitive radio network. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), and we propose a centralized offline training and distributed online execution framework based on cooperative multi-agent reinforcement learning (MARL). We employ the deep recurrent Q-network (DRQN) to address the partial observability of the state for each cognitive user. The ultimate goal is to learn a cooperative strategy that maximizes the sum throughput of the cognitive radio network in a distributed fashion, without information exchange between cognitive users. Finally, we validate the proposed algorithm in various settings through extensive experiments. The experimental results show that the proposed CoMARL-DSA algorithm outperforms the state-of-the-art deep Q-learning for spectrum access (DQSA) in terms of successful access rate and collision rate by at least 14% and 12%, respectively.
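
To make the two reported metrics concrete, the following is a minimal sketch of a slotted multichannel access model, not the authors' simulator. Everything in it is an assumption made for illustration: the collision rule (a transmission succeeds only if exactly one cognitive user picks the channel and no primary user occupies it), the shared team reward (the number of successful transmissions per slot), the per-attempt definition of both rates, and the names `step`, `evaluate`, `random_policy`, and `round_robin_policy`. The hand-written policies stand in for a learned DRQN policy.

```python
import random
from typing import Callable, List, Optional, Tuple

Action = Optional[int]  # channel index, or None to stay idle


def step(actions: List[Action], primary_busy: List[bool]) -> Tuple[List[int], int]:
    """Simulate one time slot (assumed collision model, not taken from the paper).

    Returns per-user outcomes (1 = success, -1 = collision, 0 = idle)
    and a shared team reward equal to the number of successes.
    """
    counts = {}
    for ch in actions:
        if ch is not None:
            counts[ch] = counts.get(ch, 0) + 1

    outcomes = []
    for ch in actions:
        if ch is None:
            outcomes.append(0)
        elif counts[ch] == 1 and not primary_busy[ch]:
            outcomes.append(1)
        else:
            outcomes.append(-1)
    return outcomes, outcomes.count(1)


def random_policy(user: int, t: int, n_channels: int, rng: random.Random) -> Action:
    """Uncoordinated baseline: each user picks a channel uniformly at random."""
    return rng.randrange(n_channels)


def round_robin_policy(user: int, t: int, n_channels: int, rng: random.Random) -> Action:
    """Hand-coded cooperative pattern: users rotate over distinct channels."""
    return (user + t) % n_channels


def evaluate(policy: Callable[[int, int, int, random.Random], Action],
             n_users: int, n_channels: int, n_slots: int,
             busy_prob: float, seed: int = 0) -> Tuple[float, float]:
    """Return (successful access rate, collision rate), both per transmission attempt."""
    rng = random.Random(seed)
    successes = collisions = attempts = 0
    for t in range(n_slots):
        primary_busy = [rng.random() < busy_prob for _ in range(n_channels)]
        actions = [policy(u, t, n_channels, rng) for u in range(n_users)]
        outcomes, _ = step(actions, primary_busy)
        successes += outcomes.count(1)
        collisions += outcomes.count(-1)
        attempts += sum(a is not None for a in actions)
    if attempts == 0:
        return 0.0, 0.0
    return successes / attempts, collisions / attempts


if __name__ == "__main__":
    for name, policy in [("random", random_policy), ("round-robin", round_robin_policy)]:
        success, collision = evaluate(policy, n_users=3, n_channels=4,
                                      n_slots=1000, busy_prob=0.2)
        print(f"{name}: success={success:.3f} collision={collision:.3f}")
```

In this toy model, a policy that keeps users on distinct channels avoids collisions between cognitive users without any message exchange during execution, which is the kind of coordination the abstract says the learned strategy aims for.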
