Leveraging User-Generated Metadata of Online Videos for Cover Song Identification

Simon Hachmeier, Robert Jäschke


Abstract
YouTube is a rich source of cover songs. Since the platform itself is organized in terms of videos rather than songs, the retrieval of covers is not trivial. The field of cover song identification addresses this problem and provides approaches that usually rely on audio content. However, including the user-generated video metadata available on YouTube promises improved identification results. In this paper, we propose a multi-modal approach for cover song identification on online video platforms. We combine entity resolution models with audio-based approaches using a ranking model. Our findings indicate that leveraging user-generated metadata can stabilize cover song identification performance on YouTube.
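The multi-modal combination described above can be illustrated with a minimal sketch. Note that this is not the paper's actual ranking model: the linear fusion, the `alpha` weight, and all scores below are hypothetical placeholders standing in for a metadata-based (entity resolution) signal and an audio-based similarity signal.

```python
# Hypothetical sketch: fusing a metadata-based and an audio-based
# similarity score to rank candidate videos for a query song.
# The fusion rule and weights are illustrative, not from the paper.

def fuse_scores(metadata_score, audio_score, alpha=0.5):
    """Linearly combine the two similarity signals (alpha is assumed)."""
    return alpha * metadata_score + (1 - alpha) * audio_score

def rank_candidates(candidates, alpha=0.5):
    """Rank candidate video ids by fused score, best first.

    `candidates` maps a video id to a (metadata_score, audio_score) pair.
    """
    return sorted(
        candidates,
        key=lambda vid: fuse_scores(*candidates[vid], alpha=alpha),
        reverse=True,
    )

# Toy example: three candidate videos with made-up scores.
candidates = {
    "video_a": (0.9, 0.4),  # strong metadata match, weak audio match
    "video_b": (0.3, 0.8),  # weak metadata match, strong audio match
    "video_c": (0.7, 0.7),  # balanced match across both signals
}
print(rank_candidates(candidates))  # → ['video_c', 'video_a', 'video_b']
```

With equal weighting, the balanced candidate ranks first; shifting `alpha` toward 1.0 would favor the metadata signal instead.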
Anthology ID:
2024.nlp4musa-1.8
Volume:
Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA)
Month:
November
Year:
2024
Address:
Oakland, USA
Editors:
Anna Kruspe, Sergio Oramas, Elena V. Epure, Mohamed Sordo, Benno Weck, SeungHeon Doh, Minz Won, Ilaria Manco, Gabriel Meseguer-Brocal
Venues:
NLP4MusA | WS
Publisher:
Association for Computational Linguistics
Pages:
43–48
URL:
https://aclanthology.org/2024.nlp4musa-1.8/
Cite (ACL):
Simon Hachmeier and Robert Jäschke. 2024. Leveraging User-Generated Metadata of Online Videos for Cover Song Identification. In Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA), pages 43–48, Oakland, USA. Association for Computational Linguistics.
Cite (Informal):
Leveraging User-Generated Metadata of Online Videos for Cover Song Identification (Hachmeier & Jäschke, NLP4MusA 2024)
PDF:
https://aclanthology.org/2024.nlp4musa-1.8.pdf