Article

Human-aligned trading by imitative multi-loss reinforcement learning

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Volume 234, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.eswa.2023.120939

Keywords

Algorithmic trading; Reinforcement learning; Deep Q network; Imitation learning; Human alignment

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science

Summary

Research on algorithmic trading using reinforcement learning has gained popularity in recent years. In this paper, we propose a trading model that aims to align machine trading agents with human traders. We introduce a novel multi-loss function combining supervised learning with single-step and multi-step Q-learning, and incorporate imitation learning into the training and trading processes. Our model outperforms the baseline models, and the results support the inclusion of each model feature introduced to align it with human trader behavior.

Abstract

Research into algorithmic trading using reinforcement learning has grown increasingly popular in recent years. While most work focuses on solving a particular modelling or data problem and reports positive results, we believe that in an application as critical as financial trading, aligning the machine with human behaviour is imperative and should be regarded as the basis of all further improvements, before machine algorithms are left free to innovate on their own. In this paper, we propose a trading model whose design principles centre on bringing a machine trading agent close to a human trader. We study the areas where human alignment is necessary and introduce, as a solution, a novel multi-loss function that combines supervised learning with single-step and multi-step Q-learning; we also inject the imitation learning paradigm into the training and trading processes. We further introduce a realistic backtesting setup and a holding-position-aware profit calculation scheme, under which the machine algorithm conducts intra-day trading on minute tick data over a group of U.S. stocks chosen to represent different industrial sectors and liquidity levels. The model's overall outperformance of a group of baseline models, together with our ablation study results, justifies the inclusion of the individual model features, all of which are introduced to bring aspects of the model's behaviour closer to those of a human trader.
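
The abstract names the three ingredients of the multi-loss function but does not give its formula. As a rough illustration only, the sketch below combines a single-step Q-learning loss, a multi-step Q-learning loss and a supervised large-margin loss in the style of Deep Q-learning from Demonstrations. That form is an assumption, not the paper's confirmed loss, and every name here (multi_loss, w_n, w_sup, margin, the three-action layout) is hypothetical.

```python
from typing import Sequence


def one_step_td_loss(q_sa: float, reward: float,
                     next_q_target: Sequence[float], gamma: float) -> float:
    """Squared single-step TD error, bootstrapping from target-network Q-values."""
    target = reward + gamma * max(next_q_target)
    return (target - q_sa) ** 2


def n_step_td_loss(q_sa: float, rewards: Sequence[float],
                   nth_q_target: Sequence[float], gamma: float) -> float:
    """Squared multi-step TD error over len(rewards) steps."""
    n_step_return = sum(gamma ** k * r for k, r in enumerate(rewards))
    target = n_step_return + gamma ** len(rewards) * max(nth_q_target)
    return (target - q_sa) ** 2


def margin_loss(q_values: Sequence[float], expert_action: int,
                margin: float) -> float:
    """Supervised large-margin loss: zero only when the expert (human) action
    beats every other action's Q-value by at least `margin`."""
    augmented = [q + (0.0 if a == expert_action else margin)
                 for a, q in enumerate(q_values)]
    return max(augmented) - q_values[expert_action]


def multi_loss(q_values: Sequence[float], action: int, expert_action: int,
               reward: float, next_q_target: Sequence[float],
               rewards: Sequence[float], nth_q_target: Sequence[float],
               gamma: float = 0.99, w_n: float = 1.0, w_sup: float = 1.0,
               margin: float = 0.8) -> float:
    """Weighted sum of the three terms (the weights are assumed hyperparameters)."""
    q_sa = q_values[action]
    return (one_step_td_loss(q_sa, reward, next_q_target, gamma)
            + w_n * n_step_td_loss(q_sa, rewards, nth_q_target, gamma)
            + w_sup * margin_loss(q_values, expert_action, margin))


# Hypothetical example: three actions indexed 0, 1, 2 (e.g. sell, hold, buy).
example = multi_loss(q_values=[1.0, 2.0, 0.5], action=0, expert_action=0,
                     reward=0.5, next_q_target=[0.0, 1.0],
                     rewards=[1.0, 2.0], nth_q_target=[4.0],
                     gamma=0.5, w_n=0.5, w_sup=2.0)
```

In this sketch the margin term is what pulls the agent toward the demonstrated action, while the two TD terms keep the Q-values consistent with observed rewards; how the paper actually weights or defines these terms would need to be checked against the full text.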
