YNU-HPCC at SemEval-2025 Task 5: Contrastive Learning for GND Subject Tagging with Multilingual Sentence-BERT

Hong Jiang; Jin Wang; Xuejie Zhang

Abstract

This paper describes the participation of the YNU-HPCC team (alias JH) in Sub-task 2 of SemEval-2025 Task 5, which requires fine-tuning language models to align subject tags with the TIBKAT collection. The task presents three key challenges: cross-disciplinary document coverage, bilingual (English-German) processing requirements, and extreme classification over 200,000 GND subjects. To address these challenges, we apply a contrastive learning framework using multilingual Sentence-BERT models and implement two training strategies: mixed-negative multi-label sampling, and single-label sampling with random negative selection. Our best-performing model improves average recall by 28.6%, reaching 0.2252 on the core-test set and 0.1677 on the all-test set. Notably, we observe architecture-dependent response patterns: MiniLM-series models benefit from multi-label training (+33.5% zero-shot recall), while mpnet variants excel with single-label approaches (+230.3% zero-shot recall). The study further demonstrates the effectiveness of contrastive learning for multilingual semantic alignment in low-resource scenarios, providing insights for extreme classification tasks.
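
The recall figures quoted in the abstract can be read against a minimal sketch of recall at a cutoff k, averaged over several cutoffs. This is an illustration only: the abstract does not specify the cutoffs or the averaging scheme, so the `ks` values, the function names, and the `gnd:` identifiers below are all assumptions, not the task's official evaluation code.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], gold: Set[str], k: int) -> float:
    """Fraction of gold subjects that appear among the top-k ranked predictions."""
    if not gold:
        return 0.0
    top_k = set(ranked[:k])
    return len(top_k & gold) / len(gold)


def average_recall(ranked: Sequence[str], gold: Set[str], ks=(1, 2, 3)) -> float:
    """Mean of recall@k over the cutoffs in ks (the cutoffs are an assumption)."""
    return sum(recall_at_k(ranked, gold, k) for k in ks) / len(ks)


# Hypothetical subject identifiers, used purely for illustration.
ranked = ["gnd:A", "gnd:B", "gnd:C", "gnd:D"]  # model ranking, best first
gold = {"gnd:B", "gnd:D"}                      # subjects assigned by annotators

print(recall_at_k(ranked, gold, 2))   # one of two gold subjects is in the top 2
print(average_recall(ranked, gold))   # mean over k = 1, 2, 3
```

In this toy case one of the two gold subjects is retrieved within the top 2, so recall@2 is 0.5, and both are retrieved only at k = 4.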

Anthology ID: 2025.semeval-1.318
Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 2443–2448
URL: https://aclanthology.org/2025.semeval-1.318/
Cite (ACL): Hong Jiang, Jin Wang, and Xuejie Zhang. 2025. YNU-HPCC at SemEval-2025 Task 5: Contrastive Learning for GND Subject Tagging with Multilingual Sentence-BERT. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2443–2448, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): YNU-HPCC at SemEval-2025 Task 5: Contrastive Learning for GND Subject Tagging with Multilingual Sentence-BERT (Jiang et al., SemEval 2025)
PDF: https://aclanthology.org/2025.semeval-1.318.pdf
