Journal
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
Volume: -, Issue: -, Pages: 4603-4607
Publisher
IEEE
DOI: 10.1109/ICASSP43922.2022.9746239
Keywords
Audio-to-Score Transcription; Connectionist Temporal Classification; Unconstrained Polyphony
Funding
- MCIN/AEI [PID2020-118447RA-I00]
- FEDER [IDIFEDER/2020/003]
- Valencian Government [IDIFEDER/2020/003]
Abstract
Neural Audio-to-Score (A2S) Music Transcription systems have shown promising results on pieces containing a fixed number of voices. However, they still exhibit fundamental limitations that constrain their applicability to wider scenarios. This work tackles two of them: we introduce a novel output representation that addresses shortcomings of the sequence-based A2S recognition framework, and we present a first approach to handling unconstrained polyphony. This is validated on a Convolutional Recurrent Neural Network (CRNN) with a Connectionist Temporal Classification (CTC) A2S scheme, using synthetic audio from string quartets and piano sonatas with intricate polyphonic mixtures. Our results, which improve on fixed-polyphony state-of-the-art rates, may serve as a reference for future A2S work dealing with an unconstrained number of voices.
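For readers unfamiliar with the CTC scheme the abstract mentions: a CTC-trained network emits one label per audio frame (including a special "blank"), and decoding collapses consecutive repeats and drops blanks to recover the symbol sequence. The sketch below is purely illustrative of generic CTC greedy decoding, not the paper's implementation; the `BLANK` symbol and note labels are assumptions for the example.

```python
# Illustrative sketch of CTC greedy decoding (not the paper's code).
# The network outputs one label per frame; decoding first collapses
# consecutive repeated labels, then removes blank symbols.

BLANK = "-"  # hypothetical blank symbol used in this sketch

def ctc_greedy_decode(frame_labels):
    """Collapse consecutive repeats, then drop blanks."""
    decoded = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != BLANK:
            decoded.append(lab)
        prev = lab
    return decoded

# Frame-wise argmax output for two C4 notes followed by an E4;
# the blank between the C4 runs keeps the repeated note distinct:
print(ctc_greedy_decode(["C4", "C4", "-", "C4", "E4", "E4", "-"]))
# → ['C4', 'C4', 'E4']
```

Note how the blank separating the two `C4` runs is what allows a genuinely repeated note to survive the collapse step.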