Evaluation of Pretrained Language Models on Music Understanding

Yannis Vasilakis, Rachel Bittner, Johan Pauwels


Abstract
Music-text multimodal systems have enabled new approaches to Music Information Research (MIR) applications. Despite the reported success, there has been little effort to evaluate the musical knowledge of Large Language Models (LLMs). We demonstrate that LLMs suffer from prompt sensitivity, an inability to model negation, and sensitivity to the presence of specific words. We quantified these properties with a triplet-based accuracy, which evaluates the ability to model the relative similarity of labels in a hierarchical ontology. We leveraged the AudioSet ontology to generate triplets, each consisting of an anchor, a positive and a negative label, for the genre and instrument sub-trees, and used them to evaluate six general-purpose Transformer-based models. The triplets required filtering, as some were difficult to judge and therefore relatively uninformative for evaluation purposes. Despite the relatively high accuracy reported, inconsistencies are evident in all six models, suggesting that off-the-shelf LLMs need adaptation to music before use.
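The triplet-based accuracy described in the abstract can be illustrated with a minimal sketch. It assumes that each label is mapped to an embedding vector and that similarity is measured by cosine similarity; the abstract specifies neither, so both are assumptions. The names `triplet_accuracy`, `toy_embed` and `TOY_EMBEDDINGS`, as well as the labels and vector values, are hypothetical stand-ins for a real language model and for the filtered AudioSet triplets.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def triplet_accuracy(triplets, embed):
    """Fraction of (anchor, positive, negative) triplets in which the anchor
    is more similar to the positive label than to the negative label."""
    if not triplets:
        raise ValueError("need at least one triplet")
    correct = 0
    for anchor, positive, negative in triplets:
        a, p, n = embed(anchor), embed(positive), embed(negative)
        if cosine_similarity(a, p) > cosine_similarity(a, n):
            correct += 1
    return correct / len(triplets)


# Hypothetical toy embeddings; a real evaluation would obtain these
# from a pretrained language model instead.
TOY_EMBEDDINGS = {
    "Rock music": [1.0, 0.1, 0.0, 0.0],
    "Punk rock": [0.9, 0.2, 0.0, 0.0],
    "Jazz": [0.1, 1.0, 0.0, 0.0],
    "Violin": [0.0, 0.0, 1.0, 0.1],
    "Cello": [0.0, 0.0, 0.9, 0.2],
    "Drum kit": [0.0, 0.0, 0.1, 1.0],
}


def toy_embed(label):
    """Look up the toy embedding for a label (stand-in for a model call)."""
    return TOY_EMBEDDINGS[label]


# One illustrative triplet per sub-tree (genre, instrument).
triplets = [
    ("Rock music", "Punk rock", "Jazz"),
    ("Violin", "Cello", "Drum kit"),
]

print(triplet_accuracy(triplets, toy_embed))
```

Passing the embedding function as an argument keeps the metric independent of any particular model, so the same triplets can be scored against several models or against different prompt wordings of the same labels.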
Anthology ID:
2024.nlp4musa-1.16
Volume:
Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA)
Month:
November
Year:
2024
Address:
Oakland, USA
Editors:
Anna Kruspe, Sergio Oramas, Elena V. Epure, Mohamed Sordo, Benno Weck, SeungHeon Doh, Minz Won, Ilaria Manco, Gabriel Meseguer-Brocal
Venues:
NLP4MusA | WS
Publisher:
Association for Computational Linguistics
Pages:
98–106
URL:
https://aclanthology.org/2024.nlp4musa-1.16/
Cite (ACL):
Yannis Vasilakis, Rachel Bittner, and Johan Pauwels. 2024. Evaluation of Pretrained Language Models on Music Understanding. In Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA), pages 98–106, Oakland, USA. Association for Computational Linguistics.
Cite (Informal):
Evaluation of Pretrained Language Models on Music Understanding (Vasilakis et al., NLP4MusA 2024)
PDF:
https://aclanthology.org/2024.nlp4musa-1.16.pdf