4.6 Article

Discriminatory and Orthogonal Feature Learning for Noise Robust Keyword Spotting

Journal

IEEE SIGNAL PROCESSING LETTERS
Volume 29, Issue -, Pages 1913-1917

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LSP.2022.3203911

Keywords

Measurement; Computational modeling; Feature extraction; Mathematical models; Convolution; Training; Euclidean distance; Keyword Spotting; robustness; metric learning

Funding

  1. Korea Environment Industry Technology Institute through Exotic Invasive Species Management Program - Korea Ministry of Environment [2021002280004]

Keyword Spotting is crucial for smart devices to respond to user commands, and the LOVO loss introduced in this study helps enhance the network's ability to extract discriminative features in noisy environments.
Keyword Spotting (KWS) is an essential component in a smart device for alerting the system when a user prompts it with a command. As these devices are typically constrained by computational and energy resources, the KWS model should be designed with a small footprint. In our previous work, we developed lightweight dynamic filters which extract a robust feature map within a noisy environment. The learning variables of the dynamic filter are jointly optimized with KWS weights by using Cross-Entropy (CE) loss. CE loss alone, however, is not sufficient for high performance when the SNR is low. In order to train the network for more robust performance in noisy environments, we introduce the LOw Variant Orthogonal (LOVO) loss. The LOVO loss is composed of a triplet loss applied on the output of the dynamic filter, a spectral norm-based orthogonal loss, and an inner class distance loss applied in the KWS model. These losses are particularly useful in encouraging the network to extract discriminatory features in unseen noise environments.
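
The three components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the squared-Euclidean triplet hinge, the use of the spectral norm of W W^T - I as the orthogonality penalty, the centroid-based intra-class distance, and all function names and weighting parameters (`w_triplet`, `w_orth`, `w_intra`, `margin`) are assumptions chosen for illustration only.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on squared Euclidean distances (assumed form).

    Pulls each anchor toward its positive and away from its negative.
    Inputs are arrays of shape (batch, dim).
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def spectral_orthogonal_loss(weight):
    """Spectral norm (largest singular value) of W W^T - I (assumed form).

    The value is zero when the rows of `weight` are orthonormal.
    """
    gram = weight @ weight.T
    return float(np.linalg.norm(gram - np.eye(gram.shape[0]), ord=2))


def intra_class_distance_loss(embeddings, labels):
    """Mean squared distance of each embedding to its class centroid (assumed form)."""
    total = 0.0
    for c in np.unique(labels):
        members = embeddings[labels == c]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return float(total / len(embeddings))


def lovo_like_loss(anchor, positive, negative, weight, embeddings, labels,
                   w_triplet=1.0, w_orth=1.0, w_intra=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_triplet * triplet_loss(anchor, positive, negative)
            + w_orth * spectral_orthogonal_loss(weight)
            + w_intra * intra_class_distance_loss(embeddings, labels))
```

In the setup the abstract describes, a term of this kind would be added to the Cross-Entropy loss during training, with the triplet term computed on the dynamic filter output and the other two terms computed inside the KWS model.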
