Simple Yet Powerful: An Overlooked Architecture for Nested Named Entity Recognition

Matías Rojas; Felipe Bravo-Marquez; Jocelyn Dunstan

Abstract

Named Entity Recognition (NER) is an important task in Natural Language Processing that aims to identify text spans belonging to predefined categories. Traditional NER systems ignore nested entities, which are entities contained in other entity mentions. Although several methods have been proposed to address this case, most of them rely on complex task-specific structures and ignore potentially useful baselines for the task. We argue that this creates an overly optimistic impression of their performance. This paper revisits the Multiple LSTM-CRF (MLC) model, a simple, overlooked, yet powerful approach based on training independent sequence labeling models for each entity type. Extensive experiments with three nested NER corpora show that, despite the simplicity of this model, it performs better than, or at least as well as, more sophisticated methods. Furthermore, we show that the MLC architecture achieves state-of-the-art results on the Chilean Waiting List corpus by including pre-trained language models. In addition, we implemented an open-source library that computes task-specific metrics for nested NER. The results suggest that metrics used in previous work do not adequately measure a model's ability to detect nested entities, while our metrics provide new evidence on how existing approaches handle the task.
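To illustrate the idea of one sequence labeling model per entity type, the following is a minimal sketch of how nested annotations could be split into one flat BIO tag sequence per type, each of which could then serve as the training target for its own model. It is not the authors' code: the function name `per_type_bio`, the example sentence, and the entity types `ORG` and `LOC` are hypothetical, and the sketch assumes that spans of the same type do not overlap.

```python
from collections import defaultdict


def per_type_bio(tokens, spans):
    """Split nested entity spans into one flat BIO sequence per entity type.

    spans: list of (start, end_exclusive, entity_type) token offsets.
    Assumption: spans of the same type do not overlap each other.
    """
    # One all-"O" tag sequence is created lazily for each entity type seen.
    tags = defaultdict(lambda: ["O"] * len(tokens))
    for start, end, etype in spans:
        seq = tags[etype]
        seq[start] = f"B-{etype}"
        for i in range(start + 1, end):
            seq[i] = f"I-{etype}"
    return dict(tags)


# Hypothetical example: a LOC mention nested inside an ORG mention.
tokens = ["University", "of", "Chile", "Hospital"]
spans = [(0, 4, "ORG"), (2, 3, "LOC")]
layers = per_type_bio(tokens, spans)

for etype, seq in layers.items():
    print(etype, seq)
```

Because each entity type gets its own tag sequence, the nested `LOC` span does not conflict with the enclosing `ORG` span, which is what would happen in a single flat BIO sequence.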

Anthology ID: 2022.coling-1.184
Volume: Proceedings of the 29th International Conference on Computational Linguistics
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Editors: Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 2108–2117
URL: https://aclanthology.org/2022.coling-1.184/
Cite (ACL): Matias Rojas, Felipe Bravo-Marquez, and Jocelyn Dunstan. 2022. Simple Yet Powerful: An Overlooked Architecture for Nested Named Entity Recognition. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2108–2117, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal): Simple Yet Powerful: An Overlooked Architecture for Nested Named Entity Recognition (Rojas et al., COLING 2022)
PDF: https://aclanthology.org/2022.coling-1.184.pdf
