Leveraging Probabilistic Graph Models in Nested Named Entity Recognition for Polish

Jędrzej Jamnicki


Abstract
This paper presents ongoing work on leveraging probabilistic graph models, specifically conditional random fields (CRFs) and hidden Markov models (HMMs), in nested named entity recognition (NER) for the Polish language. NER is a crucial task in natural language processing that involves identifying and classifying named entities in text documents. Nested NER deals with recognizing hierarchical structures of entities that overlap with one another, which presents additional challenges. The paper discusses the methodologies and approaches used in nested NER, focusing on CRFs and HMMs. Related works and their contributions are reviewed, and experiments are conducted on the KPWr dataset, particularly with a BiLSTM-CRF model using Word2Vec and HerBERT embeddings. The results show promise in addressing nested NER for Polish, but further research is needed to develop robust and accurate models for this complex task.
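To make the notion of nested entities concrete, the sketch below shows one common way to represent overlapping spans so that a linear-chain tagger such as a CRF or HMM can handle them: each span is placed in a separate BIO layer, outermost first. This is an illustrative assumption, not necessarily the encoding used in the paper; the function name `layered_bio`, the example sentence, and the labels `ORG` and `LOC` are hypothetical and are not taken from the KPWr annotation scheme.

```python
from typing import List, Tuple

# A span is (start, end_exclusive, label); labels here are hypothetical.
Span = Tuple[int, int, str]


def layered_bio(n_tokens: int, spans: List[Span]) -> List[List[str]]:
    """Place each span in the first BIO layer where it overlaps nothing."""
    # Longer spans first, so outer entities end up in lower layers.
    ordered = sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))
    layers: List[List[str]] = []
    for start, end, label in ordered:
        for tags in layers:
            if all(t == "O" for t in tags[start:end]):
                break  # this layer has room for the span
        else:
            # No existing layer is free: open a new one.
            tags = ["O"] * n_tokens
            layers.append(tags)
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return layers


tokens = ["Uniwersytet", "Jagielloński", "w", "Krakowie"]
# An organisation name that contains a location name.
spans = [(0, 4, "ORG"), (3, 4, "LOC")]
for tags in layered_bio(len(tokens), spans):
    print(" ".join(tags))
# B-ORG I-ORG I-ORG I-ORG
# O O O B-LOC
```

Under this assumed encoding, each layer is an ordinary flat tagging problem, so a separate sequence model (or one model applied repeatedly) could be trained per layer.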
Anthology ID:
2023.ranlp-stud.7
Volume:
Proceedings of the 8th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Momchil Hardalov, Zara Kancheva, Boris Velichkov, Ivelina Nikolova-Koleva, Milena Slavcheva
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
64–67
URL:
https://aclanthology.org/2023.ranlp-stud.7
Cite (ACL):
Jędrzej Jamnicki. 2023. Leveraging Probabilistic Graph Models in Nested Named Entity Recognition for Polish. In Proceedings of the 8th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing, pages 64–67, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Leveraging Probabilistic Graph Models in Nested Named Entity Recognition for Polish (Jamnicki, RANLP 2023)
PDF:
https://aclanthology.org/2023.ranlp-stud.7.pdf