Towards a Unified Multi-Domain Multilingual Named Entity Recognition Model

Mayank Kulkarni, Daniel Preotiuc-Pietro, Karthik Radhakrishnan, Genta Indra Winata, Shijie Wu, Lingjue Xie, Shaohua Yang


Abstract
Named Entity Recognition is a key Natural Language Processing task whose performance is sensitive to choice of genre and language. A unified NER model across multiple genres and languages is more practical and efficient by leveraging commonalities across genres or languages. In this paper, we propose a novel setup for NER which includes multi-domain and multilingual training and evaluation across 13 domains and 4 languages. We explore a range of approaches to building a unified model using domain and language adaptation techniques. Our experiments highlight multiple nuances to consider while building a unified model, including that naive data pooling fails to obtain good performance, that domain-specific adaptations are more important than language-specific ones and that including domain-specific adaptations in a unified model nears the performance of training multiple dedicated monolingual models at a fraction of their parameter count.
Anthology ID:
2023.eacl-main.161
Volume:
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Andreas Vlachos, Isabelle Augenstein
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
2210–2219
URL:
https://aclanthology.org/2023.eacl-main.161
DOI:
10.18653/v1/2023.eacl-main.161
Cite (ACL):
Mayank Kulkarni, Daniel Preotiuc-Pietro, Karthik Radhakrishnan, Genta Indra Winata, Shijie Wu, Lingjue Xie, and Shaohua Yang. 2023. Towards a Unified Multi-Domain Multilingual Named Entity Recognition Model. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2210–2219, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Towards a Unified Multi-Domain Multilingual Named Entity Recognition Model (Kulkarni et al., EACL 2023)
PDF:
https://aclanthology.org/2023.eacl-main.161.pdf
Video:
https://aclanthology.org/2023.eacl-main.161.mp4