Journal
IEEE Transactions on Audio, Speech, and Language Processing
Volume 18, Issue 6, Pages 1280-1289
Publisher
IEEE-Inst Electrical Electronics Engineers Inc
DOI: 10.1109/TASL.2009.2032947
Keywords
Chord transcription; dynamic Bayesian networks (DBNs); music signal processing
Funding
- Engineering and Physical Sciences Research Council (EPSRC) [EP/E017614/1]
Abstract
Chord labels provide a concise description of musical harmony. In pop and jazz music, a sequence of chord labels is often the only written record of a song, and forms the basis of so-called lead sheets. We devise a fully automatic method to simultaneously estimate from an audio waveform the chord sequence including bass notes, the metric positions of chords, and the key. The core of the method is a six-layered dynamic Bayesian network, in which the four hidden source layers jointly model metric position, key, chord, and bass pitch class, while the two observed layers model low-level audio features corresponding to bass and treble tonal content. Using 109 different chords, our method provides substantially more harmonic detail than previous approaches while maintaining a high level of accuracy. We show that, with 71% correctly classified chords, our method significantly exceeds the state of the art when tested against manually annotated ground-truth transcriptions on the 176 audio tracks from the MIREX 2008 Chord Detection Task. We introduce a measure of segmentation quality and show that bass and meter modeling are especially beneficial for obtaining the correct level of granularity.
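The abstract describes a network whose hidden state factorizes into metric position, key, chord, and bass pitch class, with transitions that couple the layers (e.g., chord changes are more likely on strong metric positions, and the bass tends to double the chord root). The toy sketch below illustrates that idea only: a miniature joint state space with hypothetical layer sets and made-up probabilities (none of these numbers, chord sets, or feature models come from the paper), decoded with plain Viterbi over symbolic "observations" standing in for the bass and treble audio features.

```python
from itertools import product

# Hypothetical miniature layers (illustrative, not the paper's 109-chord model):
METRIC = [1, 2]               # toy 2-beat meter
KEYS = ["C", "G"]
CHORDS = ["C", "G", "Am"]
BASSES = [0, 7, 9]            # pitch classes C, G, A
ROOT = {"C": 0, "G": 7, "Am": 9}

# Joint hidden state: (metric position, key, chord, bass pitch class)
STATES = list(product(METRIC, KEYS, CHORDS, BASSES))

def trans(prev, cur):
    """Factorized transition p(cur | prev): a product of per-layer factors."""
    m0, k0, c0, b0 = prev
    m1, k1, c1, b1 = cur
    # Metric position advances deterministically.
    p_m = 1.0 if m1 == (m0 % 2) + 1 else 0.0
    # Key is sticky.
    p_k = 0.9 if k1 == k0 else 0.1
    # Chord changes are more likely on the downbeat (metric position 1).
    if c1 == c0:
        p_c = 0.5 if m1 == 1 else 0.8
    else:
        p_c = (0.5 if m1 == 1 else 0.2) / (len(CHORDS) - 1)
    # Bass tends to double the chord root.
    p_b = 0.7 if b1 == ROOT[c1] else 0.3 / (len(BASSES) - 1)
    return p_m * p_k * p_c * p_b

def viterbi(obs):
    """Decode the most likely joint state path from (chord, bass) observations."""
    def emit(state, ob):
        _, _, c, b = state
        oc, obass = ob
        return (0.9 if c == oc else 0.05) * (0.9 if b == obass else 0.05)

    delta = {s: emit(s, obs[0]) / len(STATES) for s in STATES}  # uniform prior
    back = []
    for ob in obs[1:]:
        new, ptr = {}, {}
        for cur in STATES:
            best = max(STATES, key=lambda p: delta[p] * trans(p, cur))
            new[cur] = delta[best] * trans(best, cur) * emit(cur, ob)
            ptr[cur] = best
        back.append(ptr)
        delta = new
    state = max(delta, key=delta.get)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

Because the layers are decoded jointly rather than in a pipeline, the metric and bass layers can disambiguate the chord layer, which is the effect the abstract credits for the improved segmentation granularity.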