Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp

Shuguang Chen, Leonardo Neves, Thamar Solorio


Abstract
The performance of neural models for named entity recognition (NER) degrades over time as the models become stale. This degradation is due to temporal drift: the change in the statistical properties of the target variables over time. The issue is especially problematic for social media data, where topics change rapidly. To mitigate the problem, it is common to annotate new data and retrain models. Despite its usefulness, this process is expensive and time-consuming, which motivates new research on efficient model updating. In this paper, we propose an intuitive approach to measure the potential trendiness of tweets and use this metric to select the most informative instances for training. We conduct experiments with three state-of-the-art models on the Temporal Twitter Dataset. Our approach yields larger gains in prediction accuracy with less training data than the alternatives, making it an attractive, practical solution.
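The abstract describes the selection pipeline only at a high level; the exact trendiness metric is defined in the full paper and implemented in the linked RiTUAL-UH/trending_NER repository. As a minimal sketch of the general idea, the snippet below approximates trendiness by the growth in a token's relative frequency between a historical and a recent window, then selects the top-k tweets as training candidates. The function names, the frequency-ratio scoring, and the smoothing are illustrative assumptions, not the authors' method.

# Hypothetical sketch: rank recent tweets by a simple trendiness proxy
# (frequency ratio of tokens vs. a historical baseline) and pick the
# top-k as candidates for annotation/retraining. Not the paper's metric.
from collections import Counter
from typing import List

def trendiness_scores(recent_tweets: List[List[str]],
                      historical_tweets: List[List[str]]) -> List[float]:
    """Score each recent tweet by how much its tokens' relative
    frequency has grown compared to a historical baseline."""
    recent_counts = Counter(tok for t in recent_tweets for tok in t)
    hist_counts = Counter(tok for t in historical_tweets for tok in t)
    n_recent = sum(recent_counts.values()) or 1
    n_hist = sum(hist_counts.values()) or 1

    def token_trend(tok: str) -> float:
        # Add-one smoothed ratio of recent to historical relative frequency.
        return ((recent_counts[tok] + 1) / n_recent) / ((hist_counts[tok] + 1) / n_hist)

    # A tweet's score is the mean trend over its tokens.
    return [sum(token_trend(tok) for tok in t) / max(len(t), 1)
            for t in recent_tweets]

def select_for_training(recent_tweets: List[List[str]],
                        historical_tweets: List[List[str]],
                        k: int = 100) -> List[List[str]]:
    """Return the k most 'trendy' tweets as the most informative
    instances to annotate and add to the training set."""
    scores = trendiness_scores(recent_tweets, historical_tweets)
    ranked = sorted(range(len(recent_tweets)), key=lambda i: -scores[i])
    return [recent_tweets[i] for i in ranked[:k]]

if __name__ == "__main__":
    historical = [["the", "game", "was", "fun"], ["nice", "weather", "today"]]
    recent = [["new", "album", "drops", "today"], ["nice", "weather", "today"]]
    # The first recent tweet contains mostly unseen tokens, so it ranks higher.
    print(select_for_training(recent, historical, k=1))

Under this sketch, selection cost is linear in the corpus size, which is consistent with the abstract's framing of the method as a cheap alternative to annotating and retraining on all new data.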
Anthology ID:
2021.socialnlp-1.14
Volume:
Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media
Month:
June
Year:
2021
Address:
Online
Editors:
Lun-Wei Ku, Cheng-Te Li
Venue:
SocialNLP
Publisher:
Association for Computational Linguistics
Pages:
163–169
URL:
https://aclanthology.org/2021.socialnlp-1.14
DOI:
10.18653/v1/2021.socialnlp-1.14
Bibkey:
Cite (ACL):
Shuguang Chen, Leonardo Neves, and Thamar Solorio. 2021. Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp. In Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media, pages 163–169, Online. Association for Computational Linguistics.
Cite (Informal):
Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp (Chen et al., SocialNLP 2021)
PDF:
https://aclanthology.org/2021.socialnlp-1.14.pdf
Code:
RiTUAL-UH/trending_NER