Characterization of Moving Sound Sources Direction-of-Arrival Estimation Using Different Deep Learning Architectures

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2023)

Journal

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT

Volume 72, Issue -, Pages -

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TIM.2023.3241983

Keywords

Direction-of-arrival estimation; Acoustics; Estimation; Convolutional neural networks; Feature extraction; Task analysis; Deep learning; Direction-of-arrival (DOA) detection; machine learning; microphone arrays; moving acoustic sources; neural networks (NNs)

Automated Summary

This article evaluates the performance of a deep learning classification system for localizing moving sound sources and investigates the impact of key parameters in feature extraction and model training. The results show that window size has a significant effect on performance for moving sources but not for static sources, that sequence length affects the performance of recurrent architectures, and that a temporal convolutional neural network outperforms recurrent and feedforward networks for moving sound sources.

Abstract

Sound source localization is an important task for several applications, and the use of deep learning for this task has recently become a popular research topic. While a number of previous works have focused on static sound sources, in this article, we evaluate the performance of a deep learning classification system for localization of moving sound sources. In particular, we evaluate the effect of key parameters at the levels of feature extraction (e.g., short-time Fourier transform (STFT) parameters) and model training (e.g., neural network (NN) architectures). We evaluate the performance of different settings in terms of precision and F-score, in a multiclass multilabel classification framework. In our previous work on localization of moving sound sources, we investigated feedforward NNs (FNNs) under different acoustic conditions and STFT parameters and showed that the presence of some reverberation in the training dataset can help in achieving better detection of the direction of arrival of the sources. In this article, we extend that work to show that the window size does not affect the performance for static sources but strongly affects the performance for moving sources, that the sequence length has a significant effect on the performance of recurrent architectures, and that a temporal convolutional NN can outperform both recurrent and feedforward networks for moving sound sources.
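To make the evaluation metrics concrete, the following is a minimal sketch of how precision and F-score might be computed in a multiclass multilabel setting, where each time frame carries a binary label per candidate DOA class. The function name `precision_recall_f1`, the toy labels, and the choice of micro-averaging over all frames and classes are assumptions for illustration; the abstract does not state which averaging scheme the authors use.

```python
def precision_recall_f1(y_true, y_pred):
    """Micro-averaged precision, recall, and F-score.

    y_true, y_pred: lists of frames, each frame a list of 0/1 flags,
    one flag per candidate DOA class (hypothetical layout).
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p and t:
                tp += 1          # source predicted and present
            elif p and not t:
                fp += 1          # source predicted but absent
            elif t and not p:
                fn += 1          # source present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: 3 frames, 4 DOA classes; frame 2 has two active sources.
y_true = [[1, 0, 0, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

In this toy case one active direction is missed and one spurious direction is reported, so both error types lower the score. Per-class (macro) averaging would be an equally plausible reading of the abstract.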
