Journal
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume 34, Issue 3, Pages 1454-1464
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2021.3105407
Keywords
Task analysis; Training; Trajectory; Learning systems; Heuristic algorithms; Feature extraction; Benchmark testing; Meta-learning; reinforcement learning (RL); task adaptiveness
This article introduces an off-policy meta-RL algorithm that allows meta-learners to adapt their exploration strategy and balance task-agnostic and task-related information through latent context reorganization.
Deep reinforcement learning suffers from sample inefficiency and poor task-transfer capability. Meta-reinforcement learning (meta-RL) enables meta-learners to reuse task-solving skills trained on similar tasks and to adapt quickly to new tasks. However, existing meta-RL methods do not sufficiently examine the relationship between task-agnostic exploitation of data and the task-related knowledge introduced by latent context, which limits their effectiveness and generalization ability. In this article, we develop an off-policy meta-RL algorithm that provides meta-learners with self-oriented cognition of how they adapt to a family of tasks. In our approach, we perform dynamic task-adaptiveness distillation to describe how the meta-learners adjust their exploration strategy during meta-training. Our approach also enables the meta-learners to balance the influence of task-agnostic self-oriented adaptation and task-related information through latent context reorganization. In our experiments, our method achieves a 10%-20% higher asymptotic reward than probabilistic embeddings for actor-critic RL (PEARL).
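The latent-context mechanism the abstract refers to builds on PEARL-style probabilistic context inference, where per-transition Gaussian factors are combined into a task posterior and a sampled latent conditions the policy. A minimal sketch of that inference step, assuming a product-of-Gaussians posterior as in PEARL (the encoder outputs below are random placeholders, not the paper's actual network):

```python
import numpy as np

def gaussian_product(mus, sigmas_sq):
    """Combine per-transition Gaussian factors into one task posterior
    (product of independent Gaussians, as in PEARL-style inference)."""
    prec = 1.0 / sigmas_sq                # precision of each factor
    var = 1.0 / prec.sum(axis=0)          # posterior variance
    mu = var * (prec * mus).sum(axis=0)   # precision-weighted posterior mean
    return mu, var

rng = np.random.default_rng(0)
n_transitions, latent_dim = 8, 4
# Placeholder encoder outputs: one Gaussian factor per observed transition.
mus = rng.normal(size=(n_transitions, latent_dim))
sigmas_sq = rng.uniform(0.5, 2.0, size=(n_transitions, latent_dim))

mu, var = gaussian_product(mus, sigmas_sq)
z = mu + np.sqrt(var) * rng.normal(size=latent_dim)  # sampled task context

state = rng.normal(size=6)
policy_input = np.concatenate([state, z])  # context-conditioned policy input
print(policy_input.shape)
```

Because precisions add, the combined posterior is always at least as confident as any single transition's factor, which is what lets the meta-learner sharpen its task belief as it gathers more exploration data.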