Ketevan Tsereteli


2019

A Semi-Markov Structured Support Vector Machine Model for High-Precision Named Entity Recognition
Ravneet Arora | Chen-Tse Tsai | Ketevan Tsereteli | Prabhanjan Kambadur | Yi Yang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Named entity recognition (NER) is the backbone of many NLP solutions. F1 score, the harmonic mean of precision and recall, is often used to select and evaluate the best models. However, when precision needs to be prioritized over recall, a state-of-the-art model might not be the best choice. There is little in the literature that directly addresses training-time modifications to achieve higher-precision information extraction. In this paper, we propose a neural semi-Markov structured support vector machine model that controls the precision-recall trade-off by assigning weights to different types of errors in the loss-augmented inference during training. The semi-Markov property provides more accurate phrase-level predictions, thereby improving performance. We empirically demonstrate the advantage of our model when high precision is required by comparing against strong baselines based on CRF. In our experiments with the CoNLL 2003 dataset, our model achieves a better precision-recall trade-off at various precision levels.
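
The idea of weighting error types inside a structured hinge loss can be illustrated with a minimal sketch. This is not the paper's formulation: the names `weighted_segment_cost` and `structured_hinge`, the `fp_weight` and `fn_weight` parameters, and the representation of entities as `(start, end, label)` segments are all assumptions made for illustration. The sketch also scores a single candidate prediction, whereas loss-augmented inference would search over all segmentations for the one maximizing score plus cost.

```python
from typing import Set, Tuple

# Hypothetical segment representation: (start, end_exclusive, label).
Segment = Tuple[int, int, str]


def weighted_segment_cost(
    gold: Set[Segment],
    pred: Set[Segment],
    fp_weight: float = 2.0,  # assumed penalty per spurious segment
    fn_weight: float = 1.0,  # assumed penalty per missed segment
) -> float:
    """Cost of a prediction, charging false positives and false negatives differently."""
    false_positives = pred - gold  # predicted segments absent from gold
    false_negatives = gold - pred  # gold segments the prediction missed
    return fp_weight * len(false_positives) + fn_weight * len(false_negatives)


def structured_hinge(score_gold: float, score_pred: float, cost: float) -> float:
    """Margin-rescaled hinge loss for one candidate prediction."""
    return max(0.0, score_pred + cost - score_gold)


gold = {(0, 2, "PER"), (5, 6, "LOC")}
pred_spurious = {(0, 2, "PER"), (5, 6, "LOC"), (8, 9, "ORG")}  # one extra entity
pred_missing = {(0, 2, "PER")}  # one entity missed

cost_spurious = weighted_segment_cost(gold, pred_spurious)
cost_missing = weighted_segment_cost(gold, pred_missing)
```

With `fp_weight` larger than `fn_weight`, a spurious entity requires a larger margin than a missed one, which pushes training toward fewer false positives and therefore higher precision, at some cost in recall.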