4.7 Article

Active flow control using deep reinforcement learning with time delays in Markov decision process and autoregressive policy

Journal

PHYSICS OF FLUIDS
Volume 34, Issue 5, Pages -

Publisher

AIP Publishing
DOI: 10.1063/5.0086871

Funding

  1. University of Manchester
  2. China Scholarship Council

This study introduces a hybrid deep reinforcement learning (DRL) method that combines a Markov decision process (MDP) with time delays and a first-order autoregressive policy (ARP) to control vortex shedding from a two-dimensional circular cylinder. Compared with the standard DRL method, it reduces force fluctuations in the vortex-shedding process more stably and effectively.
Classical active flow control (AFC) methods based on solving the Navier-Stokes equations are laborious and computationally intensive, even when reduced-order models are used. Data-driven methods offer a promising alternative, and deep reinforcement learning (DRL) has been applied successfully to reduce the drag of two-dimensional bluff bodies such as a circular cylinder. However, as the Reynolds number increases, the onset of weak turbulence in the wake causes the standard DRL method to produce large fluctuations in the unsteady forces acting on the cylinder. In this study, a Markov decision process (MDP) with time delays is introduced to model and quantify the action delays that arise in the DRL environment from the time difference between control actuation and flow response, and it is combined with a first-order autoregressive policy (ARP). This hybrid DRL method is applied to control vortex shedding from a two-dimensional circular cylinder using four synthetic jet actuators at a freestream Reynolds number of 400. The method yields a stable and coherent control, which produces a steadier and more elongated vortex formation zone behind the cylinder and hence a much weaker vortex-shedding process with smaller lift and drag fluctuations. Compared with the standard DRL method, it reuses historical samples without additional sampling during training, and it reduces the magnitude of drag and lift fluctuations by approximately 90% while achieving a similar level of drag reduction in deterministic control at the same actuation frequency. This study demonstrates the necessity of including a physics-informed delay and regressive nature in the MDP and the benefits of introducing ARPs to achieve robust and temporally coherent control of unsteady forces in active flow control. Published under an exclusive license by AIP Publishing.
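
The abstract does not spell out the formulation, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions. It assumes a fixed action delay of a whole number of control steps, handled by appending the not-yet-applied actions to the observation, and exploration noise generated by a stationary first-order autoregressive process, x_t = alpha * x_{t-1} + sqrt(1 - alpha^2) * eps_t. The class names, the value of alpha, and the zero-action initialization are hypothetical and are not taken from the paper; only the action dimension of 4 reflects the four synthetic jets mentioned above.

```python
from collections import deque

import numpy as np


class DelayedActionBuffer:
    """Queue of actions that have been chosen but not yet applied.

    Assumption: the delay is a fixed integer number of control steps (>= 1).
    """

    def __init__(self, delay, action_dim):
        if delay < 1:
            raise ValueError("delay must be at least one control step")
        self.delay = delay
        self.action_dim = action_dim
        self.pending = deque(maxlen=delay)
        self.reset()

    def reset(self):
        # Assumption: no actuation is pending at the start of an episode.
        self.pending.clear()
        for _ in range(self.delay):
            self.pending.append(np.zeros(self.action_dim))

    def push(self, action):
        """Queue a new action and return the one that takes effect now."""
        applied = self.pending[0]
        # maxlen=delay makes append() drop the oldest entry automatically.
        self.pending.append(np.asarray(action, dtype=float))
        return applied

    def augment(self, observation):
        """Return the observation followed by all pending actions."""
        return np.concatenate([np.ravel(observation), *self.pending])


class AR1Noise:
    """Stationary AR(1) noise with unit marginal variance."""

    def __init__(self, alpha, dim, rng):
        if not 0.0 <= alpha < 1.0:
            raise ValueError("alpha must lie in [0, 1)")
        self.alpha = alpha
        self.dim = dim
        self.rng = rng
        self.reset()

    def reset(self):
        # Draw the initial state from the stationary distribution N(0, I).
        self.x = self.rng.standard_normal(self.dim)

    def sample(self):
        eps = self.rng.standard_normal(self.dim)
        self.x = self.alpha * self.x + np.sqrt(1.0 - self.alpha**2) * eps
        return self.x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    buffer = DelayedActionBuffer(delay=2, action_dim=4)  # four jets
    noise = AR1Noise(alpha=0.9, dim=4, rng=rng)
    observation = np.zeros(3)  # placeholder for probe measurements
    for step in range(5):
        state = buffer.augment(observation)  # what the policy would see
        mean = np.zeros(4)                   # placeholder policy output
        action = mean + 0.1 * noise.sample() # temporally correlated action
        applied = buffer.push(action)        # action reaching the flow now
        print(step, state.shape, np.round(applied, 3))
```

With alpha = 0 the noise reduces to the usual independent Gaussian exploration, while values close to 1 give smoothly varying actions. Appending the pending actions to the observation is one common way to restore the Markov property under a known delay; whether the paper uses exactly this construction is not stated in the abstract.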