Journal
MATHEMATICAL FINANCE
Volume 30, Issue 4, Pages 1273-1308
Publisher
WILEY
DOI: 10.1111/mafi.12281
Keywords
empirical study; entropy regularization; Gaussian distribution; mean-variance portfolio selection; policy improvement; reinforcement learning; simulation; stochastic control; theorem; value function
Funding
- Nie Center for Intelligent Asset Management at Columbia
- Columbia University
Abstract
We approach the continuous-time mean-variance portfolio selection problem with reinforcement learning (RL). The goal is to achieve the best trade-off between exploration and exploitation, and we formulate it as an entropy-regularized, relaxed stochastic control problem. We prove that the optimal feedback policy for this problem must be Gaussian, with time-decaying variance. We then prove a policy improvement theorem, based on which we devise an implementable RL algorithm. In our simulation and empirical studies, this algorithm and its variant outperform both traditional and deep neural-network-based algorithms.
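The abstract's central qualitative result, a Gaussian exploratory policy whose variance decays over time, can be sketched in code. This is a minimal illustration, not the paper's exact formulas: the parameter names (`rho`, `sigma`, `lam`, `target`) and the specific linear-feedback mean are assumptions made for demonstration.

```python
import numpy as np

def exploratory_policy_params(t, x, target=1.0, T=1.0,
                              rho=0.5, sigma=0.2, lam=0.1):
    """Return (mean, variance) of a Gaussian exploratory policy.

    Illustrative assumptions: the mean is a linear feedback on the gap
    between current wealth x and a target level, and the variance is
    proportional to the exploration temperature lam and decays as t
    approaches the horizon T -- mirroring the paper's qualitative
    finding of time-decaying exploration, not reproducing its formulas.
    """
    mean = -(rho / sigma) * (x - target)
    var = (lam / (2.0 * sigma**2)) * np.exp(rho**2 * (T - t))
    return mean, var

def sample_allocation(t, x, rng, **kwargs):
    """Draw one risky-asset allocation from the exploratory policy."""
    mean, var = exploratory_policy_params(t, x, **kwargs)
    return rng.normal(mean, np.sqrt(var))
```

For example, evaluating the policy at the start and at the end of the horizon shows the exploration variance shrinking while the state-feedback mean is unchanged for a fixed wealth level.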