NORMA: Neighborhood Sensitive Maps for Multilingual Word Embeddings

Ndapa Nakashole


Abstract
Inducing multilingual word embeddings by learning a linear map between embedding spaces of different languages achieves remarkable accuracy on related languages. However, accuracy drops substantially when translating between distant languages. Given that languages exhibit differences in vocabulary, grammar, written form, or syntax, one would expect the embedding spaces of different languages to have different structures, especially for distant languages. With the goal of capturing such differences, we propose a method for learning neighborhood sensitive maps, NORMA. Our experiments show that NORMA outperforms current state-of-the-art methods for word translation between distant languages.
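For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of learning a single linear map between two embedding spaces by least squares and translating by nearest neighbor. It is not the NORMA method, which this page does not describe in detail. The function names `fit_linear_map` and `translate`, the dimensions, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_map(src_vecs, tgt_vecs):
    """Least-squares W minimizing ||src_vecs @ W - tgt_vecs||_F.

    Rows of src_vecs and tgt_vecs are embeddings of word pairs from a
    seed dictionary (hypothetical here; the data below is synthetic).
    """
    W, *_ = np.linalg.lstsq(src_vecs, tgt_vecs, rcond=None)
    return W

def translate(src_vec, W, tgt_matrix):
    """Map a source vector and return the index of the nearest target row
    under cosine similarity."""
    mapped = src_vec @ W
    mapped = mapped / np.linalg.norm(mapped)
    tgt_unit = tgt_matrix / np.linalg.norm(tgt_matrix, axis=1, keepdims=True)
    return int(np.argmax(tgt_unit @ mapped))

# Synthetic stand-in for two embedding spaces related by an exact linear map.
rng = np.random.default_rng(0)
dim, n_pairs = 5, 50
X = rng.normal(size=(n_pairs, dim))   # "source language" embeddings
W_true = rng.normal(size=(dim, dim))  # assumed ground-truth map
Y = X @ W_true                        # "target language" embeddings

W = fit_linear_map(X, Y)
match = translate(X[3], W, Y)
```

In this toy setting the two spaces are related by one global map, so a single `W` recovers every pair. The abstract's point is that this assumption is less plausible for distant languages, which is the motivation given for neighborhood sensitive maps.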
Anthology ID:
D18-1047
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
512–522
URL:
https://aclanthology.org/D18-1047
DOI:
10.18653/v1/D18-1047
PDF:
https://aclanthology.org/D18-1047.pdf