Article

Reinforcement Learning-Based Autonomous Navigation and Obstacle Avoidance for USVs under Partially Observable Conditions

Journal

MATHEMATICAL PROBLEMS IN ENGINEERING
Volume 2021

Publisher

HINDAWI LTD
DOI: 10.1155/2021/5519033

Funding

  1. National Natural Science Foundation of China Youth Fund [61902001]
  2. Anhui Province University Outstanding Youth Talent Support Program
  3. Natural Science Project of Anhui Polytechnic University [Xjky072019A01]

This paper proposes UANOA, a deep reinforcement learning-based method for autonomous navigation and obstacle avoidance of unmanned surface vehicles (USVs), and reports good performance in complex ocean environments.
Unmanned surface vehicles (USVs) are widely used in research and exploration, patrol, and defense. Autonomous navigation and obstacle avoidance are essential USV technologies and a key condition for successful mission execution. However, conventional algorithms that rely on fine-grained modeling cannot deliver the real-time, precise behavior control that USVs need in complex environments, which poses a great challenge for autonomous control policies. In this paper, a deep reinforcement learning-based method, UANOA (USV autonomous navigation and obstacle avoidance), is proposed. UANOA accomplishes the autonomous navigation task by sensing partial, complex ocean information around the USV in real time and outputting rudder angle control commands in real time. We employ a double Q-network to achieve end-to-end control from raw sensor input to discrete rudder actions, and we design a set of reward functions adapted to USV navigation and obstacle avoidance. To alleviate the decision bias caused by the partial observability of the USV, we use long short-term memory (LSTM) networks to strengthen the USV's memory of the ocean environment. Experiments demonstrate that UANOA brings a USV to its target points with optimal path planning in complex ocean environments without any collisions, and that UANOA outperforms the deep Q-network (DQN) and a random control policy in convergence speed, sailing distance, rudder angle steering consumption, and other performance measures.
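
The abstract contrasts a double Q-network with a plain DQN but does not spell out the update rule. As a minimal sketch, assuming "double Q-network" refers to the standard Double DQN target (the online network picks the next action, the target network scores it), the difference can be illustrated in plain Python. The function names, the three-action rudder set, and all numeric values below are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def dqn_target(reward: float, done: bool,
               q_target_next: Sequence[float], gamma: float = 0.99) -> float:
    """Plain DQN target: the target network both selects and scores the action."""
    if done:
        # No bootstrapping from a terminal state (e.g. collision or goal reached).
        return reward
    return reward + gamma * max(q_target_next)


def double_q_target(reward: float, done: bool,
                    q_online_next: Sequence[float],
                    q_target_next: Sequence[float],
                    gamma: float = 0.99) -> float:
    """Double DQN target: the online network selects, the target network scores."""
    if done:
        return reward
    # Index of the greedy action under the online network.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for three discrete rudder actions (left, hold, right).
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [0.5, 1.0, 4.0]

plain = dqn_target(1.0, False, q_target_next, gamma=0.5)
double = double_q_target(1.0, False, q_online_next, q_target_next, gamma=0.5)
```

In this toy case the plain DQN target bootstraps from the target network's largest value, while the double target uses the value of the action the online network prefers, which is the usual argument for why the double form reduces overestimation.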
