Abstract:

We present OpenMU-LightBench, a large-scale benchmark for training and evaluating music understanding models built on large language models (LLMs). OpenMU-LightBench consists of approximately one million examples covering two music understanding subtasks: music captioning and music reasoning. We describe how OpenMU-LightBench was constructed, including metadata collection and conversion, and showcase data generated by prompting GPT-3.5. We release OpenMU-LightBench in the hope that its rich annotations will facilitate future research on, and development of, LLM-based music understanding models.