Journal
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
Volume -, Issue -, Pages 885-889Publisher
IEEE
DOI: 10.1109/icassp.2019.8682909
Keywords
polyphonic sound event detection; convolutional recurrent neural network; bidirectional LSTM; transfer learning
Categories
Ask authors/readers for more resources
This paper presents a novel approach to improve the performance of polyphonic sound event detection that combines a convolutional bidirectional recurrent neural network (CBRNN) with transfer learning. The ordinary convolutional recurrent neural network (CRNN) is known to suffer from a vanishing gradient problem, which significantly reduces the efficiency of information transfer to past events. To resolve this issue, we combine forward and backward long short-term memory (LSTM) modules and demonstrate that they complement each other. To effectively deal with the issue of overfitting that arises from increased model complexity, we apply transfer learning with a dataset that contains synthesized artifacts. We show that the model achieves faster and better performance with less data. Simulations with the 2016 TUT dataset show that the performance of the CBRNN with transfer learning is dramatically improved compared to the ordinary CRNN; the F1 score was 28.4% higher, and the error rate was 0.42 lower.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available