End-to-end keyword search system based on attention mechanism and energy scorer for low resource languages

NEURAL NETWORKS (2021)

Journal

NEURAL NETWORKS

Volume 139, Pages 326-334

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.neunet.2021.04.002

Keywords

Keyword search; End-to-end; Low resource language; Deep neural network

Funding

National Natural Science Foundation of China, China [U1836219]
National Key R&D Program of China [2019GQG0001]
Institute for Guo Qiang of Tsinghua University, China [2019GQG0001]
Cross-Media Intelligent Technology Project of Beijing National Research Center for Information Science and Technology (BNRist), China [BNR2019TD01022]

Automated Summary

Keyword search (KWS) is the task of finding keywords in continuous speech, and recent advances in deep learning allow KWS systems to be trained end-to-end (E2E). The proposed E2E model outperforms the baseline model and is effective under low-resource conditions.

Abstract

Keyword search (KWS) is the task of finding user-specified keywords in continuous speech. Conventional KWS systems are built on automatic speech recognition (ASR): the input speech must first be processed by the ASR system before keywords can be searched. Over the past decade, as deep learning and deep neural networks (DNNs) have grown in popularity, it has become possible to train KWS systems in an end-to-end (E2E) manner. The main advantage of E2E KWS is that it requires no speech recognition, which makes training and searching much more straightforward than in traditional systems. This article proposes an E2E KWS model that consists of four parts: a speech encoder-decoder, a query encoder-decoder, an attention mechanism, and an energy scorer. First, the proposed model outperforms the baseline model. Second, we find that under different kinds of supervision (character or phoneme sequences), the speech or query encoder can extract the corresponding information, which results in different performance. Moreover, we introduce an attention mechanism and a novel energy scorer: the former helps locate keywords, and the latter makes the final decision by considering speech embeddings, query embeddings, and attention weights in parallel. We evaluate our model under low-resource conditions, with about 10 hours of training data, for four different languages. The experimental results show that the proposed model works well under low-resource conditions. (C) 2021 Elsevier Ltd. All rights reserved.
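The abstract gives no equations, so the following is only an illustrative sketch of how attention weights, a speech embedding, and a query embedding could feed a single detection score. It is not the authors' implementation. The sketch assumes scaled dot-product attention and a logistic scorer as stand-ins, replaces the learned encoders with hand-written vectors, and uses hypothetical names throughout (`attend`, `energy_score`, `params`, and the weights `w_sim`, `w_peak`, `bias`).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(speech_frames, query_vec):
    """Scaled dot-product attention of one query embedding over speech frames.

    Returns the attention weights (one per frame) and the attention-weighted
    sum of the frames, used here as a speech embedding.
    """
    dim = len(query_vec)
    scores = [dot(frame, query_vec) / math.sqrt(dim) for frame in speech_frames]
    weights = softmax(scores)
    context = [
        sum(w * frame[i] for w, frame in zip(weights, speech_frames))
        for i in range(dim)
    ]
    return weights, context


def energy_score(context, query_vec, weights, params):
    """Hypothetical scorer (an assumption, not the paper's formula).

    Combines speech/query similarity with how sharply the attention peaks,
    then squashes the resulting energy into (0, 1) with a sigmoid.
    """
    similarity = dot(context, query_vec)
    peak = max(weights)
    energy = params["w_sim"] * similarity + params["w_peak"] * peak + params["bias"]
    return 1.0 / (1.0 + math.exp(-energy))


# Toy embeddings standing in for encoder outputs (made up for illustration).
query = [1.0, 1.0, -1.0, 0.5]
background = [[0.1, 0.0, 0.0, 0.1], [0.0, 0.2, 0.1, 0.0], [0.0, -0.1, 0.0, 0.1]]
# Utterance with a keyword-like frame placed at index 2.
with_keyword = background[:2] + [query] + background[2:]

params = {"w_sim": 1.0, "w_peak": 1.0, "bias": -1.0}  # hand-picked, not trained

weights, context = attend(with_keyword, query)
pos_score = energy_score(context, query, weights, params)

neg_weights, neg_context = attend(background, query)
neg_score = energy_score(neg_context, query, neg_weights, params)

print("most attended frame:", weights.index(max(weights)))
print("score with keyword:", round(pos_score, 3))
print("score without keyword:", round(neg_score, 3))
```

In this toy setup the attention weights indicate where the keyword is likely to be, while the scorer decides whether it is present at all, which mirrors the division of roles described in the abstract. In a trained system the parameters would be learned jointly with the encoders rather than hand-picked.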
