Zero-Shot Cross-lingual Name Retrieval for Low-Resource Languages

Kevin Blissett, Heng Ji


Abstract
In this paper we address a challenging cross-lingual name retrieval task. Given an English named entity query, we aim to find all name mentions in documents in low-resource languages. We present a novel method that relies on no annotation or resources from the target language. By leveraging freely available cross-lingual resources and a small amount of training data from another language, we are able to perform name retrieval on a new language without any additional training data. Our method proceeds in multiple steps. First, we pre-train a language-independent orthographic encoder using Wikipedia inter-lingual links from dozens of languages. Next, we gather user expectations about important entities in a comparable English document and compare those expected entities with actual spans of the target language text in order to perform name finding. Our method shows an 11.6% absolute F-score improvement over state-of-the-art methods.
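The comparison step described in the abstract (matching expected English entities against spans of target-language text) can be illustrated with a minimal sketch. This is not the authors' implementation: the learned, language-independent orthographic encoder is replaced here by a simple character-bigram count vector, which only works when both strings share a script. The function names (`char_ngrams`, `candidate_spans`, `retrieve_names`), the similarity threshold, and the maximum span length are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Count character n-grams of a lowercased, boundary-padded string.

    Stand-in for a learned orthographic encoder (an assumption of this sketch).
    """
    padded = f"#{text.lower()}#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def candidate_spans(tokens, max_len=3):
    """Yield (start, end, text) for every token span of up to max_len tokens."""
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            yield start, end, " ".join(tokens[start:end])


def retrieve_names(expected_entities, tokens, threshold=0.8, max_len=3):
    """Return spans whose best similarity to an expected entity meets the threshold.

    Each hit is (start, end, span_text, matched_entity, score).
    The threshold value is a hypothetical choice, not taken from the paper.
    """
    encoded = {name: char_ngrams(name) for name in expected_entities}
    if not encoded:
        return []
    hits = []
    for start, end, text in candidate_spans(tokens, max_len):
        span_vec = char_ngrams(text)
        name, score = max(
            ((n, cosine(vec, span_vec)) for n, vec in encoded.items()),
            key=lambda pair: pair[1],
        )
        if score >= threshold:
            hits.append((start, end, text, name, score))
    return hits


# Hypothetical romanized target-language sentence, used only for illustration.
tokens = "prezident Barak Obama bugun Toshkentga keldi".split()
hits = retrieve_names(["Barack Obama", "Tashkent"], tokens)
```

In this toy setting, a close spelling variant such as "Barak Obama" scores highly against the expected entity "Barack Obama", while a more heavily inflected form such as "Toshkentga" falls below the strict threshold. A learned encoder of the kind the abstract describes would presumably be needed to handle such variation and to compare names across different scripts.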
Anthology ID:
D19-6131
Volume:
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Colin Cherry, Greg Durrett, George Foster, Reza Haffari, Shahram Khadivi, Nanyun Peng, Xiang Ren, Swabha Swayamdipta
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
275–280
URL:
https://aclanthology.org/D19-6131
DOI:
10.18653/v1/D19-6131
Cite (ACL):
Kevin Blissett and Heng Ji. 2019. Zero-Shot Cross-lingual Name Retrieval for Low-Resource Languages. In Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), pages 275–280, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Zero-Shot Cross-lingual Name Retrieval for Low-Resource Languages (Blissett & Ji, 2019)
PDF:
https://aclanthology.org/D19-6131.pdf