Article

Toward learning robust contrastive embeddings for binaural sound source localization

FRONTIERS IN NEUROINFORMATICS (2022)

Journal

FRONTIERS IN NEUROINFORMATICS

Volume 16, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA

DOI: 10.3389/fninf.2022.942978

Keywords

manifold learning; non-linear dimension reduction; siamese neural network; binaural sound source localization; deep learning

Categories

Mathematical & Computational Biology; Neurosciences

Funding

China Scholarship Council (CSC) [201707650021]
Postdoctoral Fellow of the Research Foundation Flanders-FWO-Vlaanderen [12X6719N]
KU Leuven Internal Funds [C2-16-00449, VES/19/004]
European Research Council (ERC) under the European Union [773268]

Summary

This paper proposes a parametric embedding that maps binaural cues to a low-dimensional space for source localization. The embedding performs well in various acoustic conditions, outperforms previous unsupervised embeddings, and matches or outperforms direct estimation models, with the clearest gains when training data is limited.

Abstract

Recent deep-neural-network-based methods provide accurate binaural source localization. These data-driven models map measured binaural cues directly to source locations, so their performance depends heavily on the training data distribution. In this paper, we propose a parametric embedding that maps the binaural cues to a low-dimensional space where localization can be done with nearest-neighbor regression. We implement the embedding using a neural network optimized to map points that are close to each other in the latent space (the space of source azimuths or elevations) to nearby points in the embedding space. As a result, the Euclidean distances between embeddings reflect the proximity of their sources, and the embeddings form a manifold, which makes them interpretable. We show that the proposed embedding generalizes well to acoustic conditions (with reverberation) different from those encountered during training, and performs better than the unsupervised embeddings previously used for binaural localization. In addition, the proposed method performs at least as well as a feed-forward neural network that estimates the source locations directly from the binaural cues, and it outperforms the feed-forward model when only a small amount of training data is available. We also compare embeddings trained with supervised and weakly supervised learning and show that both perform similarly well, but the weakly supervised embedding allows source azimuth and elevation to be estimated simultaneously.
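The two ingredients named in the abstract, an embedding trained so that Euclidean distance reflects source proximity and localization by nearest-neighbor regression, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the loss or the network, so the classic pairwise contrastive loss is used here as an assumed stand-in, and the function names (`contrastive_loss`, `knn_regress`), the margin, the value of `k`, and the toy 2-D embeddings are all hypothetical.

```python
import numpy as np

def contrastive_loss(z_a, z_b, similar, margin=1.0):
    """Classic pairwise contrastive loss (assumed stand-in, not the paper's loss).

    z_a, z_b : (n_pairs, dim) embeddings of the two inputs of each pair.
    similar  : (n_pairs,) 1.0 if the two sources are close in angle, else 0.0.
    """
    d = np.linalg.norm(z_a - z_b, axis=1)
    pull = similar * d ** 2                                   # draw similar pairs together
    push = (1.0 - similar) * np.maximum(0.0, margin - d) ** 2  # push dissimilar pairs apart
    return float(np.mean(pull + push))

def knn_regress(train_emb, train_angles, query_emb, k=3):
    """Estimate source angles as the mean angle of the k nearest training embeddings.

    A plain mean is only valid when angles do not wrap around (e.g. a
    frontal azimuth range of -90 to 90 degrees, as assumed here).
    """
    dists = np.linalg.norm(query_emb[:, None, :] - train_emb[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return train_angles[nearest].mean(axis=1)

# Toy stand-in for a learned embedding: azimuths on a 5-degree grid placed
# on a half circle, so that embedding distance grows with angular distance.
train_az = np.linspace(-90.0, 90.0, 37)
train_emb = np.stack([np.cos(np.deg2rad(train_az)), np.sin(np.deg2rad(train_az))], axis=1)

# Query whose true azimuth (12 degrees) lies between grid points.
query = np.array([[np.cos(np.deg2rad(12.0)), np.sin(np.deg2rad(12.0))]])
estimate = knn_regress(train_emb, train_az, query, k=3)
print(estimate)  # about 10 degrees: the mean of the neighbors at 5, 10 and 15
```

In this toy setup the estimate is limited by the spacing of the training grid and by the choice of `k`. In the method the abstract describes, the embeddings would instead come from a neural network applied to measured binaural cues.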
