SemEval-2025 Task 5: LLMs4Subjects - LLM-based Automated Subject Tagging for a National Technical Library’s Open-Access Catalog

Jennifer D’Souza; Sameer Sadruddin; Holger Israel; Mathias Begoin; Diana Slawig

Abstract

We present SemEval-2025 Task 5: LLMs4Subjects, a shared task on automated subject tagging for scientific and technical records in English and German using the GND taxonomy. Participants developed LLM-based systems to recommend top-k subjects, evaluated through quantitative metrics (precision, recall, F1-score) and qualitative assessments by subject specialists. Results highlight the effectiveness of LLM ensembles, synthetic data generation, and multilingual processing, offering insights into the use of LLMs for digital library classification. The task attracted over 700 participants. We received final submissions from more than 200 teams and 93 system description papers. We report baseline results, as well as findings on the best-performing systems, the most common approaches, and the most effective methods across various tracks and languages. The dataset for this task is publicly available at https://github.com/emotion-analysis-project/SemEval2025-task11.
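The quantitative metrics named in the abstract can be illustrated for a single record with a minimal sketch. This is not the task's official scorer: the function name, the subject identifiers, and the choice to divide precision by k are all assumptions made for illustration, and the predicted list is assumed to contain no duplicates.

```python
def precision_recall_f1_at_k(predicted, gold, k):
    """Precision@k, recall@k and F1@k for one record.

    predicted: ranked list of subject identifiers (best first, no duplicates).
    gold: subject identifiers assigned by the annotators.
    k: cutoff for the ranked list.
    """
    top_k = predicted[:k]
    gold_set = set(gold)
    # Count how many of the top-k recommendations are gold subjects.
    hits = sum(1 for subject in top_k if subject in gold_set)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator > 0 else 0.0
    return precision, recall, f1


# Hypothetical identifiers, not real GND codes.
predicted = ["subj:A", "subj:B", "subj:C", "subj:D"]
gold = ["subj:B", "subj:D", "subj:E"]
p, r, f = precision_recall_f1_at_k(predicted, gold, k=3)
```

With k=3 only one of the three recommendations is a gold subject, so precision and recall are both 1/3 here. Averaging these per-record scores over a test set is one common way to report results, though the aggregation used by the organizers is not stated in the abstract.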

Anthology ID: 2025.semeval-1.328
Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 2570–2583
URL: https://aclanthology.org/2025.semeval-1.328/
Cite (ACL): Jennifer D’Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin, and Diana Slawig. 2025. SemEval-2025 Task 5: LLMs4Subjects - LLM-based Automated Subject Tagging for a National Technical Library’s Open-Access Catalog. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2570–2583, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): SemEval-2025 Task 5: LLMs4Subjects - LLM-based Automated Subject Tagging for a National Technical Library’s Open-Access Catalog (D’Souza et al., SemEval 2025)
PDF: https://aclanthology.org/2025.semeval-1.328.pdf
