4.7 Article

Efficient Incremental Offline Reinforcement Learning With Sparse Broad Critic Approximation

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TSMC.2023.3305498

Keywords

Broad learning system (BLS); incremental critic design; linear critic approximation (LCA); offline reinforcement learning (ORL); variable force tracking

Offline reinforcement learning (ORL) has gained attention in robot learning due to its ability to learn policies directly from precollected samples. This article proposes a novel incremental ORL approach with sparse broad critic approximation (BORL) that combines the advantages of the broad learning system (BLS) and approximate policy iteration (API) methods. Simulation studies demonstrate that BORL achieves comparable or better performance than conventional API methods without laborious hyperparameter fine-tuning.
Offline reinforcement learning (ORL) has been getting increasing attention in robot learning, benefiting from its ability to avoid hazardous exploration and learn policies directly from precollected samples. Approximate policy iteration (API) is one of the most commonly investigated ORL approaches in robotics, due to its linear representation of policies, which makes it fairly transparent in both theoretical and engineering analysis. One open problem of API is how to design efficient and effective basis functions. The broad learning system (BLS) has been extensively studied in supervised and unsupervised learning in various applications. However, few investigations have been conducted on ORL. In this article, a novel incremental ORL approach with sparse broad critic approximation (BORL) is proposed with the advantages of BLS, which approximates the critic function in a linear manner with randomly projected sparse and compact features and dynamically expands its broad structure. The BORL is the first extension of API with BLS in the field of robotics and ORL. The approximation ability and convergence performance of BORL are also analyzed. Comprehensive simulation studies are then conducted on two benchmarks, and the results demonstrate that the proposed BORL can obtain comparable or better performance than conventional API methods without laborious hyperparameter fine-tuning work. To further demonstrate the effectiveness of BORL in practical robotic applications, a variable force tracking problem in robotic ultrasound scanning (RUSS) is investigated, and a learning-based adaptive impedance control (LAIC) algorithm is proposed based on BORL. The experimental results demonstrate the advantages of LAIC compared with conventional force tracking methods.
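
The idea of a linear critic over randomly projected, expandable features can be illustrated with a minimal sketch. This is not the BORL algorithm from the article: it assumes a generic BLS-style feature map (random linear feature nodes plus tanh enhancement nodes), a regularized LSTD solve for the critic weights, and synthetic offline samples. The helper names `make_broad_features`, `broad_phi`, and `lstd_critic`, the toy dynamics, and all sizes are hypothetical, and the sketch refits from scratch after expansion rather than reproducing the article's sparse features or incremental update.

```python
import numpy as np

def make_broad_features(state_dim, n_feature, n_enhance, rng):
    """Draw fixed random weights for a BLS-style feature map (hypothetical helper)."""
    W_f = rng.standard_normal((state_dim, n_feature))   # feature-node weights
    W_e = rng.standard_normal((n_feature, n_enhance))   # enhancement-node weights
    return W_f, W_e

def broad_phi(s, W_f, W_e):
    """Map states of shape (N, state_dim) to [feature nodes | enhancement nodes | bias]."""
    Z = s @ W_f                         # linear feature nodes
    H = np.tanh(Z @ W_e)                # nonlinear enhancement nodes
    ones = np.ones((s.shape[0], 1))     # constant bias column
    return np.hstack([Z, H, ones])

def lstd_critic(phi, phi_next, r, gamma=0.95, reg=1e-3):
    """Solve the regularized LSTD system A w = b for linear critic weights w."""
    A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(phi.shape[1])
    b = phi.T @ r
    # Least-squares solve, so a poorly conditioned A does not raise an error.
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Synthetic precollected samples (assumed toy dynamics, not from the article).
rng = np.random.default_rng(0)
N, state_dim = 500, 2
s = rng.uniform(-1.0, 1.0, size=(N, state_dim))
s_next = 0.9 * s + 0.05 * rng.standard_normal((N, state_dim))
r = -np.sum(s ** 2, axis=1)

# Initial critic: 4 feature nodes, 10 enhancement nodes, 1 bias term.
W_f, W_e = make_broad_features(state_dim, n_feature=4, n_enhance=10, rng=rng)
phi_small = broad_phi(s, W_f, W_e)
w_small = lstd_critic(phi_small, broad_phi(s_next, W_f, W_e), r)

# Broad expansion: append 10 new enhancement nodes, keep the old ones, refit.
W_e_big = np.hstack([W_e, rng.standard_normal((4, 10))])
phi_big = broad_phi(s, W_f, W_e_big)
w_big = lstd_critic(phi_big, broad_phi(s_next, W_f, W_e_big), r)

# The critic stays linear in the weights: V(s) is approximated by phi(s) @ w.
v_small = phi_small @ w_small
v_big = phi_big @ w_big
```

In this sketch the structure grows only in width: existing random weights are kept and new columns are appended, so the critic remains a linear function of a wider feature vector.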
