Proceedings Paper

POLYPHONIC SOUND EVENT DETECTION USING CONVOLUTIONAL BIDIRECTIONAL LSTM AND SYNTHETIC DATA-BASED TRANSFER LEARNING

Publisher

IEEE
DOI: 10.1109/icassp.2019.8682909

Keywords

polyphonic sound event detection; convolutional recurrent neural network; bidirectional LSTM; transfer learning


This paper presents a novel approach to improving polyphonic sound event detection that combines a convolutional bidirectional recurrent neural network (CBRNN) with transfer learning. The ordinary convolutional recurrent neural network (CRNN) is known to suffer from the vanishing gradient problem, which limits how effectively information from past events is carried forward through time. To resolve this issue, we combine forward and backward long short-term memory (LSTM) modules and demonstrate that they complement each other. To cope with the overfitting that arises from the increased model complexity, we apply transfer learning with a dataset containing synthesized artifacts, and we show that the model converges faster and performs better with less data. Simulations on the 2016 TUT dataset show that the CBRNN with transfer learning dramatically outperforms the ordinary CRNN: the F1 score is 28.4% higher and the error rate is 0.42 lower.
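The bidirectional LSTM idea described above, running a forward pass over the feature frames and a time-reversed backward pass, then concatenating the two so every frame's output carries both past and future context, can be sketched in plain NumPy. This is an illustrative toy under stated assumptions, not the paper's implementation: the layer sizes, the random weights, and the function names (`lstm_step`, `bidirectional_lstm`) are all hypothetical, and the convolutional front end, classifier head, and training loop are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate weights are stacked as [input, forget, output, candidate]."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:])
    c = f * c + i * g       # new cell state
    h = o * np.tanh(c)      # new hidden state
    return h, c

def run_lstm(frames, W, U, b, H):
    """Run an LSTM over a (T, D) sequence of feature frames; returns (T, H) outputs."""
    h, c = np.zeros(H), np.zeros(H)
    outs = []
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
        outs.append(h)
    return np.stack(outs)

def bidirectional_lstm(frames, fwd_params, bwd_params, H):
    """Concatenate a forward pass with a time-reversed backward pass so each
    frame's output sees both past and future context."""
    fwd = run_lstm(frames, *fwd_params, H)
    bwd = run_lstm(frames[::-1], *bwd_params, H)[::-1]  # re-align to forward time
    return np.concatenate([fwd, bwd], axis=1)           # shape (T, 2*H)

# Tiny demo with random weights (placeholders, not trained values).
rng = np.random.default_rng(0)
T, D, H = 6, 8, 4           # frames, feature dimension, hidden size
frames = rng.normal(size=(T, D))
make_params = lambda: (rng.normal(scale=0.1, size=(4*H, D)),
                       rng.normal(scale=0.1, size=(4*H, H)),
                       np.zeros(4*H))
context = bidirectional_lstm(frames, make_params(), make_params(), H)
```

In a full CBRNN these per-frame context vectors would feed a frame-wise sigmoid classifier, one output unit per event class, so that overlapping (polyphonic) events can be active simultaneously.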
