Low-resource named entity recognition via multi-source projection: Not quite there yet?

Jan Vium Enghoff, Søren Harrison, Željko Agić


Abstract
Projecting linguistic annotations through word alignments is one of the most prevalent approaches to cross-lingual transfer learning. Conventional wisdom suggests that annotation projection “just works” regardless of the task at hand. We carefully consider multi-source projection for named entity recognition. Our experiment with 17 languages shows that to detect named entities in true low-resource languages, annotation projection may not be the right way to move forward. On a more positive note, we also uncover the conditions that do favor named entity projection from multiple sources. We argue these are infeasible under noisy low-resource constraints.
Anthology ID:
W18-6125
Volume:
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
Month:
November
Year:
2018
Address:
Brussels, Belgium
Editors:
Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
195–201
URL:
https://aclanthology.org/W18-6125
DOI:
10.18653/v1/W18-6125
Cite (ACL):
Jan Vium Enghoff, Søren Harrison, and Željko Agić. 2018. Low-resource named entity recognition via multi-source projection: Not quite there yet? In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, pages 195–201, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Low-resource named entity recognition via multi-source projection: Not quite there yet? (Enghoff et al., WNUT 2018)
PDF:
https://aclanthology.org/W18-6125.pdf