Combining Minimally-supervised Methods for Arabic Named Entity Recognition

Maha Althobaiti, Udo Kruschwitz, Massimo Poesio


Abstract
Supervised methods can achieve high performance on NLP tasks, such as Named Entity Recognition (NER), but new annotations are required for every new domain and/or genre change. This has motivated research in minimally supervised methods such as semi-supervised learning and distant learning, but neither technique has yet achieved performance levels comparable to those of supervised methods. Semi-supervised methods tend to have very high precision but comparatively low recall, whereas distant learning tends to achieve higher recall but lower precision. This complementarity suggests that better results may be obtained by combining the two types of minimally supervised methods. In this paper we present a novel approach to Arabic NER using a combination of semi-supervised and distant learning techniques. We trained a semi-supervised NER classifier and another one using distant learning techniques, and then combined them using a variety of classifier combination schemes, including the Bayesian Classifier Combination (BCC) procedure recently proposed for sentiment analysis. According to our results, the BCC model leads to an increase in performance of 8 percentage points over the best base classifiers.
Anthology ID:
Q15-1018
Volume:
Transactions of the Association for Computational Linguistics, Volume 3
Year:
2015
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
243–255
URL:
https://aclanthology.org/Q15-1018
DOI:
10.1162/tacl_a_00136
Cite (ACL):
Maha Althobaiti, Udo Kruschwitz, and Massimo Poesio. 2015. Combining Minimally-supervised Methods for Arabic Named Entity Recognition. Transactions of the Association for Computational Linguistics, 3:243–255.
Cite (Informal):
Combining Minimally-supervised Methods for Arabic Named Entity Recognition (Althobaiti et al., TACL 2015)
PDF:
https://aclanthology.org/Q15-1018.pdf