A Feature-based Ensemble Approach to Recognition of Emerging and Rare Named Entities

Utpal Kumar Sikdar; Björn Gambäck

doi:10.18653/v1/W17-4424

Abstract

Detecting previously unseen named entities in text is a challenging task. The paper describes how three initial classifier models were built using Conditional Random Fields (CRFs), Support Vector Machines (SVMs) and a Long Short-Term Memory (LSTM) recurrent neural network. The outputs of these three classifiers were then used as features to train another CRF classifier working as an ensemble. Five-fold cross-validation on the training and development data of the emerging and rare named entity recognition shared task showed precision, recall and F1-score of 66.87%, 46.75% and 54.97%, respectively. For surface form evaluation, the CRF ensemble-based system achieved precision, recall and F1-score of 65.18%, 45.20% and 53.30%. When applied to unseen test data, the model reached 47.92% precision, 31.97% recall and 38.55% F1-score for entity level evaluation, with corresponding surface form evaluation values of 44.91%, 30.47% and 36.31%.
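The stacking step described in the abstract, where the three base classifiers' outputs become input features for a final CRF, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `stack_features`, the feature keys, the extra `all_agree` feature, the BIO-style tags and the example sentence are all assumptions made for illustration, and the abstract does not list the actual feature set.

```python
from typing import Dict, List, Union

Feature = Dict[str, Union[str, bool]]


def stack_features(
    tokens: List[str],
    crf_tags: List[str],
    svm_tags: List[str],
    lstm_tags: List[str],
) -> List[Feature]:
    """Build one feature dict per token from three base-classifier outputs.

    Hypothetical helper: the keys below are illustrative, not taken
    from the paper.
    """
    if not (len(tokens) == len(crf_tags) == len(svm_tags) == len(lstm_tags)):
        raise ValueError("all sequences must have the same length")

    features: List[Feature] = []
    for i, token in enumerate(tokens):
        votes = [crf_tags[i], svm_tags[i], lstm_tags[i]]
        features.append(
            {
                "token.lower": token.lower(),       # assumed lexical feature
                "crf_tag": crf_tags[i],             # base CRF prediction
                "svm_tag": svm_tags[i],             # base SVM prediction
                "lstm_tag": lstm_tags[i],           # base LSTM prediction
                "all_agree": len(set(votes)) == 1,  # assumed agreement feature
            }
        )
    return features


# Made-up example: the base models disagree on the second token.
tokens = ["Visiting", "Copenhagen", "today"]
crf_out = ["O", "B-location", "O"]
svm_out = ["O", "O", "O"]
lstm_out = ["O", "B-location", "O"]

stacked = stack_features(tokens, crf_out, svm_out, lstm_out)
```

A sequence of such per-token dictionaries, one list per sentence, is the kind of input a dictionary-based CRF toolkit could take when training the second-level model on gold labels.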

Anthology ID: W17-4424
Volume: Proceedings of the 3rd Workshop on Noisy User-generated Text
Month: September
Year: 2017
Address: Copenhagen, Denmark
Editors: Leon Derczynski, Wei Xu, Alan Ritter, Tim Baldwin
Venue: WNUT
Publisher: Association for Computational Linguistics
Pages: 177–181
URL: https://aclanthology.org/W17-4424/
DOI: 10.18653/v1/W17-4424
Cite (ACL): Utpal Kumar Sikdar and Björn Gambäck. 2017. A Feature-based Ensemble Approach to Recognition of Emerging and Rare Named Entities. In Proceedings of the 3rd Workshop on Noisy User-generated Text, pages 177–181, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal): A Feature-based Ensemble Approach to Recognition of Emerging and Rare Named Entities (Sikdar & Gambäck, WNUT 2017)
PDF: https://aclanthology.org/W17-4424.pdf
