Neural language model embeddings for Named Entity Recognition: A study from language perspective

Muskaan Maurya; Anupam Mandal; Manoj Maurya; Naval Gupta; Somya Nayak

Abstract

Named entity recognition (NER) models based on neural language models (LMs) exhibit state-of-the-art performance. However, the performance of such LMs has not been studied in detail with respect to finer language-related aspects in the context of NER tasks. Such a study will help in applying these models effectively to cross-lingual and multilingual NER tasks. In this study, we examine the effects of script, vocabulary sharing, foreign names, and pooling of multi-language training data on building NER models. We observe that monolingual BERT embeddings show the highest recognition accuracy among all transformer-based LMs for monolingual NER models. We also find that vocabulary sharing and data augmentation with foreign named entities (NEs) are the most effective ways of improving the accuracy of cross-lingual NER models. Multilingual NER models trained by pooling data from similar languages can address training data inadequacy and exhibit performance close to that of monolingual models trained with adequate NER-tagged data of a single language.

Anthology ID: 2023.icon-1.5
Volume: Proceedings of the 20th International Conference on Natural Language Processing (ICON)
Month: December
Year: 2023
Address: Goa University, Goa, India
Editors: Jyoti D. Pawar, Sobha Lalitha Devi
Venue: ICON
SIG: SIGLEX
Publisher: NLP Association of India (NLPAI)
Pages: 44–51
URL: https://aclanthology.org/2023.icon-1.5/
Cite (ACL): Muskaan Maurya, Anupam Mandal, Manoj Maurya, Naval Gupta, and Somya Nayak. 2023. Neural language model embeddings for Named Entity Recognition: A study from language perspective. In Proceedings of the 20th International Conference on Natural Language Processing (ICON), pages 44–51, Goa University, Goa, India. NLP Association of India (NLPAI).
Cite (Informal): Neural language model embeddings for Named Entity Recognition: A study from language perspective (Maurya et al., ICON 2023)
PDF: https://aclanthology.org/2023.icon-1.5.pdf
