Publisher
IEEE
DOI: 10.1109/SSCI50451.2021.9659979
Keywords
Convolutional neural networks; deep learning; audio processing
This paper proposes a 1D residual convolutional neural network architecture for music genre classification. The audio signal is split into overlapping segments, and aggregating the predictions over several segments improves the final accuracy. Experimental results show that the model achieves a mean accuracy of 80.93% and outperforms other 1D CNN architectures.
This paper proposes a 1D residual convolutional neural network (CNN) architecture for music genre classification and compares it with other recent 1D CNN architectures. The 1D CNNs learn both a representation and a discriminant directly from the raw audio signal. Several convolutional layers capture the time-frequency characteristics of the audio signal and learn filters relevant to the music genre recognition task. The proposed approach splits the audio signal into overlapping segments using a sliding window to comply with the fixed-length input constraint of the 1D CNNs. As a result, music genre classification can be carried out on a single audio segment or by aggregating the predictions over several audio segments, which improves the final accuracy. The performance of the proposed 1D residual CNN is assessed on a public dataset of 1,000 audio clips. The experimental results show that it achieves a mean accuracy of 80.93% in classifying music genres and outperforms other 1D CNN architectures.
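The sliding-window segmentation and prediction aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the segment length, the hop size, or the aggregation rule, so the function names, the parameters `segment_len` and `hop`, the zero-padding of short clips, and the choice of averaging class probabilities are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def split_into_segments(signal, segment_len, hop):
    """Split a 1D signal into fixed-length, overlapping segments.

    segment_len and hop are hypothetical parameters; hop < segment_len
    gives overlapping windows. A trailing remainder shorter than one
    full window is dropped.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < segment_len:
        # Assumption: zero-pad clips shorter than one window.
        signal = np.pad(signal, (0, segment_len - len(signal)))
    starts = range(0, len(signal) - segment_len + 1, hop)
    return np.stack([signal[s:s + segment_len] for s in starts])


def aggregate_predictions(segment_probs):
    """Combine per-segment class probabilities into one clip-level label.

    Assumption: average the probabilities over segments and take the
    argmax. Majority voting would be another plausible rule.
    """
    segment_probs = np.asarray(segment_probs, dtype=float)
    return int(np.argmax(segment_probs.mean(axis=0)))


# Toy usage: a 10-sample "clip", windows of 4 samples, hop of 2 samples.
clip = np.arange(10.0)
segments = split_into_segments(clip, segment_len=4, hop=2)

# Stand-in for per-segment model outputs over two classes (made-up numbers).
probs = np.array([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
label = aggregate_predictions(probs)
```

In practice each row of `segments` would be passed to the trained 1D CNN, and the resulting per-segment probabilities would take the place of the made-up `probs` array.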