A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping

Kamil Pluciński, Mateusz Lango, Michał Zimniewicz


Abstract
In this work, we study the unsupervised cross-lingual word embeddings mapping method presented by Artetxe et al. (2018). First, wesuccessfully reproduced the experiments performed in the original work, finding only minor differences. Furthermore, we verified themethod’s robustness on different embedding representations and new language pairs, particularly these involving Slavic languages likePolish or Czech. We also performed an experimental analysis of the impact of the method’s parameters on the final result. Finally, welooked for an alternative way of initialization, which directly relies on the isometric assumption. Our work confirms the results presentedearlier, at the same time pointing at interesting problems occurring while using the method with different types of embeddings or onless-common language pairs.
Anthology ID:
2020.lrec-1.682
Volume:
Proceedings of the 12th Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
SIG:
Publisher:
European Language Resources Association
Note:
Pages:
5555–5562
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.682
DOI:
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/2020.lrec-1.682.pdf