Article

Semisupervised Deep Reinforcement Learning in Support of IoT and Smart City Services

Journal

IEEE INTERNET OF THINGS JOURNAL
Volume 5, Issue 2, Pages 624-635

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JIOT.2017.2712560

Keywords

Bluetooth low energy indoor localization; deep learning; deep reinforcement learning (DRL); indoor positioning; Internet of Things (IoT); IoT smart services; reinforcement learning; semisupervised deep reinforcement learning; smart city

Funding

  1. Qatar National Research Fund [7-1113-1-199]

Abstract

Smart services are an important element of smart city and Internet of Things (IoT) ecosystems, where the intelligence behind the services is obtained and improved through sensory data. Providing a large amount of training data is not always feasible; therefore, we need to consider alternative ways that incorporate unlabeled data as well. In recent years, deep reinforcement learning (DRL) has achieved great success in several application domains. It is an applicable method for IoT and smart city scenarios where auto-generated data can be partially labeled by users' feedback for training purposes. In this paper, we propose a semisupervised DRL model that fits smart city applications as it consumes both labeled and unlabeled data to improve the performance and accuracy of the learning agent. The model utilizes variational autoencoders as the inference engine for generalizing optimal policies. To the best of our knowledge, the proposed model is the first investigation that extends DRL to the semisupervised paradigm. As a case study of smart city applications, we focus on smart buildings and apply the proposed model to the problem of indoor localization based on Bluetooth low energy signal strength. Indoor localization is a main component of smart city services, since people spend significant time in indoor environments. Our model learns the best action policies that lead to a close estimation of the target locations with an improvement of 23% in terms of distance to the target and at least 67% more received rewards compared to the supervised DRL model.
