Improving Cross-Lingual Word Embeddings by Meeting in the Middle

Yerai Doval, Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert


Abstract
Cross-lingual word embeddings are becoming increasingly important in multilingual NLP. Recently, it has been shown that these embeddings can be effectively learned by aligning two disjoint monolingual vector spaces through linear transformations, using no more than a small bilingual dictionary as supervision. In this work, we propose to apply an additional transformation after the initial alignment step, which moves cross-lingual synonyms towards a middle point between them. By applying this transformation our aim is to obtain a better cross-lingual integration of the vector spaces. In addition, and perhaps surprisingly, the monolingual spaces also improve by this transformation. This is in contrast to the original alignment, which is typically learned such that the structure of the monolingual spaces is preserved. Our experiments confirm that the resulting cross-lingual embeddings outperform state-of-the-art models in both monolingual and cross-lingual evaluation tasks.
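
The two steps described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' released implementation. It assumes NumPy, toy random vectors in place of real embeddings, orthogonal Procrustes as a stand-in for the initial alignment, and an unconstrained least-squares map onto the pairwise averages as the "middle point" step. The function names procrustes_align and meet_in_the_middle are hypothetical.

```python
import numpy as np


def procrustes_align(X, Z):
    """Orthogonal map W minimizing ||X @ W - Z||_F.

    Assumption: used here only as a stand-in for the initial alignment step.
    Row i of X and row i of Z are the vectors of one bilingual dictionary pair.
    """
    U, _, Vt = np.linalg.svd(X.T @ Z)
    return U @ Vt


def meet_in_the_middle(X_aligned, Z):
    """Fit one unconstrained linear map per space onto the pair midpoints."""
    M = (X_aligned + Z) / 2.0  # middle point of each translation pair
    W_x, *_ = np.linalg.lstsq(X_aligned, M, rcond=None)
    W_z, *_ = np.linalg.lstsq(Z, M, rcond=None)
    return W_x, W_z


# Toy "dictionary": 200 translation pairs in a 5-dimensional space.
rng = np.random.default_rng(0)
n_pairs, dim = 200, 5
X = rng.normal(size=(n_pairs, dim))                # source-language vectors
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))   # hidden rotation
Z = X @ Q + 0.1 * rng.normal(size=(n_pairs, dim))  # noisy target-language vectors

# Step 1: initial alignment of the source space onto the target space.
W = procrustes_align(X, Z)
X_aligned = X @ W

# Step 2: move both spaces towards the midpoints of the dictionary pairs.
W_x, W_z = meet_in_the_middle(X_aligned, Z)

gap_before = np.linalg.norm(X_aligned - Z)
gap_after = np.linalg.norm(X_aligned @ W_x - Z @ W_z)
print(f"gap after alignment: {gap_before:.4f}")
print(f"gap after averaging: {gap_after:.4f}")
```

In this sketch the gap between dictionary pairs cannot grow in the second step, because the identity map is always a candidate solution of each least-squares problem. Unlike the orthogonal map in step 1, the maps W_x and W_z are unconstrained, so they are free to change the internal structure of each monolingual space.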
Anthology ID:
D18-1027
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
294–304
URL:
https://aclanthology.org/D18-1027
DOI:
10.18653/v1/D18-1027
Cite (ACL):
Yerai Doval, Jose Camacho-Collados, Luis Espinosa-Anke, and Steven Schockaert. 2018. Improving Cross-Lingual Word Embeddings by Meeting in the Middle. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 294–304, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Improving Cross-Lingual Word Embeddings by Meeting in the Middle (Doval et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1027.pdf
Code
 yeraidm/meemi