Journal
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS
Volume 19, Issue 7, Pages 4535-4548
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TWC.2020.2984758
Keywords
Backscatter; Radio frequency; Real-time systems; Performance evaluation; Wireless communication; Time division multiple access; Interference; Symbiotic radio network (SRN); ambient backscatter communication (AmBC); user association; deep reinforcement learning
Funding
- National Natural Science Foundation of China [61631005, U1801261]
- National Key Research and Development Program of China [2018YFB1801105]
- 111 Project [B20064]
- U.S. National Science Foundation [CCF-0939370, CCF-1513915, CCF-1908308]
- [ZYGX2019Z022]
In this paper, we are interested in symbiotic radio networks (SRNs), in which an Internet-of-Things (IoT) network parasitizes a primary cellular network to achieve spectrum-, energy-, and infrastructure-efficient communications. Each IoT device transmits its own information by backscattering the signals from the primary network without using an active radio-frequency (RF) transmitter chain. We consider the symbiosis between the cellular network and the IoT network and focus on the user association problem in SRNs. Specifically, the base station (BS) in the primary network serves multiple cellular users using time division multiple access (TDMA), and each IoT device is associated with one cellular user for information transmission. The objective of user association is to link each IoT device to an appropriate cellular user so as to maximize the sum rate of all IoT devices. However, full real-time channel information is difficult to obtain, which makes it hard to design an optimal policy for this problem. To overcome this issue, we propose two deep reinforcement learning (DRL) algorithms, both of which use historical information to infer the current information in order to make appropriate decisions. One algorithm, referred to as centralized DRL, makes decisions for all IoT devices at one time using globally available information. The other algorithm, referred to as distributed DRL, makes a decision for only one IoT device at a time using locally available information. Finally, simulation results show that the two proposed DRL algorithms achieve performance comparable to the optimal user association policy, which requires perfect real-time information, and that the distributed DRL algorithm has the advantage of scalability.
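To make the sum-rate objective concrete, the following is a minimal sketch of an exhaustive-search baseline that assumes the per-pair rates are perfectly known. It is not the paper's DRL method. The rate matrix `rates`, the function name `best_association`, and the one-to-one constraint (each cellular user's TDMA slot hosts at most one IoT device) are illustrative assumptions, not details taken from the abstract.

```python
from itertools import permutations


def best_association(rates):
    """Exhaustively search one-to-one associations that maximize the sum rate.

    rates[i][j] is a hypothetical achievable rate of IoT device i when it is
    associated with cellular user j (assumed known here, unlike in the paper).
    """
    n_devices = len(rates)
    n_users = len(rates[0])
    best_sum, best_assignment = float("-inf"), None
    # Each permutation assigns a distinct cellular user to every IoT device
    # (assumed constraint; requires n_devices <= n_users).
    for users in permutations(range(n_users), n_devices):
        total = sum(rates[i][j] for i, j in enumerate(users))
        if total > best_sum:
            best_sum, best_assignment = total, users
    return best_assignment, best_sum


# Toy example: 2 IoT devices, 3 cellular users, made-up rates.
rates = [
    [1.0, 2.0, 0.5],
    [2.5, 2.0, 0.1],
]
assignment, total = best_association(rates)
print(assignment, total)
```

The search cost grows factorially with the number of devices and depends on rates that are not available in real time, which is the gap the abstract says the learning-based policies are meant to address.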