4.7 Article

Deep-Learning-Assisted Sound Source Localization From a Flying Drone

Journal

IEEE SENSORS JOURNAL
Volume 22, Issue 21, Pages 20828-20838

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSEN.2022.3207660

Keywords

Drones; Time-frequency analysis; Location awareness; Sensors; Estimation; Noise measurement; Microphone arrays; Deep neural network (DNN); drone audition; ego-noise reduction; microphone array; sound source localization

This study proposes a deep-learning-based framework for sound source localization from a flying drone. It addresses two challenges: the strong ego-noise of the drone, and the motion of both the drone and the sound sources. By integrating single-channel noise reduction with multichannel source localization, the framework robustly processes signals in dynamic, low-SNR scenarios and outperforms competing methods.
Sound source localization from a flying drone is a challenging task due to the strong ego-noise from rotating motors and propellers as well as the movement of the drone and the sound sources. To address this challenge, we propose a deep-learning-based framework that integrates single-channel noise reduction and multichannel source localization. In this framework, we suppress the ego-noise and estimate a time-frequency soft ratio mask with a single-channel deep neural network (DNN). Then, we design two downstream multichannel source localization algorithms, based on steered response power (SRP-DNN) and time-frequency spatial filtering (TFS-DNN). The main novelty lies in the proposed TFS-DNN approach, which estimates the presence probability of the target sound at the individual time-frequency bins by combining the DNN-inferred soft ratio mask and the instantaneous direction of arrival (DOA) of the sound received by the microphone array. The time-frequency presence probability of the target sound is then used to design a set of spatial filters to construct a spatial likelihood map for source localization. By jointly exploiting spectral and spatial information, TFS-DNN robustly processes signals in short segments (e.g., 0.5 s) in dynamic and low signal-to-noise-ratio (SNR) scenarios (e.g., SNR -20 dB). Results on real and simulated data in a variety of scenarios (static sources, moving sources, and moving drones) indicate the advantage of TFS-DNN over competing methods, including SRP-DNN and the state-of-the-art TFS.
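The idea of weighting a steered-response-power map with a time-frequency mask can be illustrated with a minimal sketch. This is a generic mask-weighted SRP-PHAT in NumPy, not the paper's SRP-DNN or TFS-DNN implementation. The function names, the four-microphone square array, the frequency grid, and the hand-built binary mask (standing in for a DNN-inferred soft ratio mask) are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(mic_xy, azimuths_rad):
    """Far-field arrival delays in seconds, shape (D, M), relative to the array origin."""
    directions = np.stack([np.cos(azimuths_rad), np.sin(azimuths_rad)], axis=1)
    # A microphone displaced toward the source hears the wavefront earlier.
    return -(directions @ mic_xy.T) / SPEED_OF_SOUND


def masked_srp_phat(stft, mask, freqs_hz, mic_xy, azimuths_rad):
    """Mask-weighted SRP-PHAT map (generic sketch, not the paper's method).

    stft: complex array (M, F, T); mask: real array (F, T) with values in [0, 1].
    Returns one non-negative score per candidate azimuth, shape (D,).
    """
    phase_only = stft / np.maximum(np.abs(stft), 1e-12)  # PHAT whitening
    delays = steering_delays(mic_xy, azimuths_rad)       # (D, M)
    # Undo the propagation phase exp(-j*2*pi*f*tau) for each candidate direction.
    steer = np.exp(2j * np.pi * delays[:, :, None] * freqs_hz[None, None, :])
    beam = np.einsum("dmf,mft->dft", steer, phase_only)  # delay-and-sum output
    # Only bins the mask attributes to the target contribute to the map.
    return np.einsum("dft,ft->d", np.abs(beam) ** 2, mask)


# Toy scene (all values hypothetical): target at 60 degrees, interferer at 200 degrees.
rng = np.random.default_rng(0)
mic_xy = 0.1 * np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
freqs_hz = np.linspace(200.0, 4000.0, 64)
azimuths_rad = np.deg2rad(np.arange(0, 360, 5))
n_frames = 10


def plane_wave(azimuth_deg):
    """Unit-magnitude multichannel STFT of a far-field source, shape (M, F, T)."""
    tau = steering_delays(mic_xy, np.deg2rad([azimuth_deg]))[0]
    source = np.exp(2j * np.pi * rng.random((freqs_hz.size, n_frames)))
    phase = np.exp(-2j * np.pi * tau[:, None, None] * freqs_hz[None, :, None])
    return phase * source[None]


target = plane_wave(60.0)
interferer = plane_wave(200.0)
mask = np.zeros((freqs_hz.size, n_frames))
mask[:16] = 1.0  # pretend the mask marks the 16 lowest bins as target-dominated
stft = np.where(mask[None] > 0, target, interferer)

srp_map = masked_srp_phat(stft, mask, freqs_hz, mic_xy, azimuths_rad)
estimated_deg = int(np.rad2deg(azimuths_rad[np.argmax(srp_map)]).round())
print(estimated_deg)
```

In this toy scene the interferer occupies three times as many bins as the target, yet the mask zeroes out its contribution, so the peak of the map falls on the target direction. A real system would replace the hand-built mask with the network's soft ratio mask and would have to cope with noise leaking into the retained bins.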
