Notewise Evaluation of Source Separation: A Case Study for Separated Piano Tracks

Yigitcan Özer (International Audio Laboratories Erlangen)*, Hans-Ulrich Berendes (International Audio Laboratories Erlangen), Vlora Arifi-Müller (International Audio Laboratories Erlangen), Fabian-Robert Stöter (AudioShake, Inc.), Meinard Müller (International Audio Laboratories Erlangen)

Keywords: Evaluation, datasets, and reproducibility -> evaluation metrics; Evaluation, datasets, and reproducibility -> annotation protocols; Evaluation, datasets, and reproducibility -> evaluation methodology; Evaluation, datasets, and reproducibility -> novel datasets and use cases; Knowledge-driven approaches to MIR -> machine learning/artificial intelligence for music; MIR tasks -> sound source separation

Abstract:

Deep learning has significantly advanced music source separation (MSS), which aims to decompose music recordings into individual tracks corresponding to singing or specific instruments. Typically, results are evaluated using quantitative measures such as the signal-to-distortion ratio (SDR), computed over entire excerpts or songs. As the main contribution of this article, we introduce a novel evaluation approach that decomposes an audio track into musically meaningful sound events and applies the evaluation metric to these units. In a case study, we apply this strategy to the challenging task of separating piano concerto recordings into piano and orchestra tracks. To assess piano separation quality, we use a score-informed nonnegative matrix factorization approach to decompose the reference and separated piano tracks into notewise sound events. In experiments with various MSS systems, we demonstrate that our notewise evaluation, which takes into account factors such as pitch range and musical complexity, deepens the understanding of both the source separation results and the intricacies of the underlying music.
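
To illustrate the idea of applying a metric per sound event rather than per excerpt, the following is a minimal sketch, not the implementation used in the article. It assumes that the notewise decomposition has already been carried out (for example by a score-informed NMF step, which is not shown) and that each note event is available as a pair of equal-length signals, one from the reference piano track and one from the separated piano track. The SDR used here is the basic energy-ratio definition, which may differ from the variant used in the experiments. The names `sdr_db`, `notewise_sdr`, and the toy note keys are hypothetical.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Basic SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: plain energy-ratio definition; eps avoids division by zero.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))


def notewise_sdr(ref_events, est_events):
    """Compute one SDR value per note event.

    Both arguments are dicts mapping a (hypothetical) note key to a 1-D
    signal; the decomposition into note events is assumed to be given.
    """
    return {key: sdr_db(ref_events[key], est_events[key]) for key in ref_events}


# Toy data: two synthetic "notes" with different amounts of added noise.
rng = np.random.default_rng(0)
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
ref_events = {
    "A4": np.sin(2 * np.pi * 440.0 * t),
    "A1": np.sin(2 * np.pi * 55.0 * t),
}
est_events = {
    "A4": ref_events["A4"] + 0.01 * rng.standard_normal(t.size),
    "A1": ref_events["A1"] + 0.10 * rng.standard_normal(t.size),
}

scores = notewise_sdr(ref_events, est_events)
for note, value in scores.items():
    print(f"{note}: {value:.1f} dB")
```

Per-note values of this kind can then be grouped, for instance by pitch range, instead of being collapsed into a single number for the whole excerpt.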
