Neural Cross-Lingual Named Entity Recognition with Minimal Resources

Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, Jaime Carbonell


Abstract
For languages with no annotated resources, unsupervised transfer of natural language processing models such as named entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and word order across languages make it a challenging problem. To improve mapping of lexical items across languages, we propose a method that finds translations based on bilingual word embeddings. To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order. We demonstrate that these methods achieve state-of-the-art or competitive NER performance on commonly tested languages under a cross-lingual setting, with much lower resource requirements than past approaches. We also evaluate the challenges of applying these methods to Uyghur, a low-resource language.
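The abstract's first ingredient, finding translations via bilingual word embeddings, amounts to a nearest-neighbor lookup in a shared embedding space. Below is a minimal sketch of that idea, assuming source and target embeddings have already been aligned into one bilingual space; all function names and data here are hypothetical illustrations, not the authors' released code (see the Code link below for that).

import numpy as np

def translate_words(src_vecs, tgt_vecs, tgt_words):
    """Hypothetical sketch: map each source word to its nearest target word.

    src_vecs: (n_src, d) source-language embeddings in the shared space.
    tgt_vecs: (n_tgt, d) target-language embeddings in the same space.
    tgt_words: list of n_tgt target-language word strings.
    """
    # L2-normalize so that a dot product equals cosine similarity.
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src @ tgt.T            # (n_src, n_tgt) cosine similarities
    nearest = sims.argmax(axis=1) # best target index per source word
    return [tgt_words[i] for i in nearest]

# Toy usage: 3 source words, 4 target candidates, 5-dim embeddings.
rng = np.random.default_rng(0)
src = rng.normal(size=(3, 5))
tgt = rng.normal(size=(4, 5))
print(translate_words(src, tgt, ["casa", "perro", "gato", "libro"]))

In the paper's setting, word-by-word translations of this kind turn annotated source-language sentences into pseudo-target training data; the second proposed ingredient, self-attention in the tagger, is then meant to soften the word-order mismatches such literal translation introduces.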
Anthology ID:
D18-1034
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
369–379
URL:
https://aclanthology.org/D18-1034
DOI:
10.18653/v1/D18-1034
Cite (ACL):
Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, and Jaime Carbonell. 2018. Neural Cross-Lingual Named Entity Recognition with Minimal Resources. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 369–379, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Neural Cross-Lingual Named Entity Recognition with Minimal Resources (Xie et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1034.pdf
Code:
thespectrewithin/cross-lingual_NER
Data:
CoNLL 2002