Journal
IEEE ACCESS
Volume 9, Pages 152310-152321
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3127209
Keywords
Reinforcement learning; Mathematical models; Feature extraction; Heuristic algorithms; Stock markets; Solid modeling; Predictive models; Algorithmic trading; deep learning; state representation learning; imitation learning; reinforcement learning
Funding
- Basic Science Research Program through the National Research Foundation of Korea (NRF) - Ministry of Education [NRF-2018R1D1A1B07043727]
- Kwangwoon University
Abstract
Algorithmic trading allows investors to avoid emotional and irrational trading decisions and helps them make profits using modern computer technology. In recent years, reinforcement learning has yielded promising results for algorithmic trading. Two prominent challenges in algorithmic trading with reinforcement learning are (1) extracting robust features and (2) learning a profitable trading policy. A further challenge is that prior work often assumes that both long and short positions are always available in stock trading, whereas in practice taking a short position is risky or sometimes impossible. We propose a practical algorithmic trading method, SIRL-Trader, which achieves good profit using only long positions. SIRL-Trader uses offline/online state representation learning (SRL) and imitative reinforcement learning. In offline SRL, we apply dimensionality reduction and clustering to extract robust features; in online SRL, we co-train a regression model with a reinforcement learning model to provide accurate state information for decision-making. In imitative reinforcement learning, we incorporate a behavior-cloning technique into the twin-delayed deep deterministic policy gradient (TD3) algorithm and apply multistep learning and dynamic delay to TD3. Experimental results show that SIRL-Trader yields higher profits and offers superior generalization ability compared with state-of-the-art methods.
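The abstract does not name the specific dimensionality-reduction or clustering algorithms used in offline SRL. The minimal sketch below assumes PCA and k-means (via scikit-learn) purely as stand-ins, to illustrate the pipeline of compressing noisy market features and clustering them into robust, low-dimensional state labels; the feature matrix is a random placeholder for indicator data derived from price series.

# Hedged sketch of offline state representation learning:
# dimensionality reduction, then clustering, to obtain robust state labels.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
raw_features = rng.normal(size=(1000, 32))  # placeholder for OHLCV-derived indicators

X = StandardScaler().fit_transform(raw_features)

pca = PCA(n_components=8)                   # dimensionality-reduction step (assumed: PCA)
Z = pca.fit_transform(X)

kmeans = KMeans(n_clusters=16, n_init=10, random_state=0)  # clustering step (assumed: k-means)
state_ids = kmeans.fit_predict(Z)           # robust, low-dimensional state labels

print(state_ids[:10])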
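For online SRL, the abstract describes co-training a regression model with the reinforcement learning model so that the learned state carries accurate, decision-relevant information. Below is a minimal sketch of one way to set this up, assuming a shared encoder with separate actor and regression heads; the layer sizes, the next-return regression target, the loss weight, and the placeholder RL loss are all assumptions, not values from the paper.

# Hedged sketch: co-train a regression head with the RL model so the shared
# encoder produces state features that are predictive of future prices.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    def __init__(self, in_dim, hid=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU(),
                                 nn.Linear(hid, hid), nn.ReLU())
    def forward(self, x):
        return self.net(x)

encoder = SharedEncoder(in_dim=8)
actor_head = nn.Linear(64, 1)        # trading action head (e.g., position size)
reg_head = nn.Linear(64, 1)          # auxiliary regression head (e.g., next return)

opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(actor_head.parameters()) +
                       list(reg_head.parameters()), lr=1e-3)

obs = torch.randn(32, 8)             # batch of market states (placeholder data)
next_return = torch.randn(32, 1)     # supervised regression target (placeholder)

z = encoder(obs)
action = torch.tanh(actor_head(z))
rl_loss = -action.mean()             # placeholder standing in for -Q(s, pi(s)) from TD3
reg_loss = nn.functional.mse_loss(reg_head(z), next_return)
loss = rl_loss + 0.5 * reg_loss      # joint objective; the 0.5 weight is an assumption
opt.zero_grad(); loss.backward(); opt.step()

Because both heads backpropagate through the same encoder, the regression signal regularizes the representation that the policy consumes, which is the stated purpose of the online SRL step.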
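For imitative reinforcement learning, the abstract combines behavior cloning with TD3 and adds multistep learning and dynamic delay. The sketch below shows an actor update in the spirit of TD3+BC, with the n-step critic target and the varying update delay indicated in comments; the exact formulation, networks, and weighting in SIRL-Trader may differ, and the demonstrator actions here are synthetic placeholders.

# Hedged sketch of an imitative actor update: a TD3-style policy-gradient term
# plus a behavior-cloning (BC) regression toward demonstrator actions.
import torch
import torch.nn as nn

actor = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh())
critic = nn.Sequential(nn.Linear(8 + 1, 64), nn.ReLU(), nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

state = torch.randn(32, 8)                 # placeholder batch of states
demo_action = torch.rand(32, 1) * 2 - 1    # placeholder demonstrator actions in [-1, 1]

pi = actor(state)
q = critic(torch.cat([state, pi], dim=1))
bc_weight = 2.5 / q.abs().mean().detach()  # adaptive weighting, as in TD3+BC (assumed)
actor_loss = -bc_weight * q.mean() + nn.functional.mse_loss(pi, demo_action)
actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
# (only the actor is stepped here; critic gradients are ignored in this sketch)

# Multistep (n-step) critic target, schematically:
#   y = sum_{k=0}^{n-1} gamma**k * r_{t+k}
#       + gamma**n * min(Q1', Q2')(s_{t+n}, pi'(s_{t+n}))
# Dynamic delay: update the actor every d critic steps, with d adjusted during
# training rather than fixed at TD3's usual d = 2.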