A Robust Self-Learning Method for Fully Unsupervised Cross-Lingual Mappings of Word Embeddings: Making the Method Robustly Reproducible as Well

Nicolas Garneau, Mathieu Godbout, David Beauchemin, Audrey Durand, Luc Lamontagne


Abstract
In this paper, we reproduce the experiments of Artetxe et al. (2018b) on their robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. We show that their method can indeed be reproduced, given some minor assumptions. We further investigate the robustness of their model by introducing four new languages that are less similar to English than those used in the original paper. To assess the stability of their model, we also conduct a grid search over sensible hyperparameters. Finally, we propose key recommendations, applicable to any research project, for delivering fully reproducible research.
Anthology ID: 2020.lrec-1.681
Volume: Proceedings of the Twelfth Language Resources and Evaluation Conference
Month: May
Year: 2020
Address: Marseille, France
Venue: LREC
Publisher: European Language Resources Association
Pages: 5546–5554
Language: English
URL: https://aclanthology.org/2020.lrec-1.681
Cite (ACL): Nicolas Garneau, Mathieu Godbout, David Beauchemin, Audrey Durand, and Luc Lamontagne. 2020. A Robust Self-Learning Method for Fully Unsupervised Cross-Lingual Mappings of Word Embeddings: Making the Method Robustly Reproducible as Well. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5546–5554, Marseille, France. European Language Resources Association.
Cite (Informal): A Robust Self-Learning Method for Fully Unsupervised Cross-Lingual Mappings of Word Embeddings: Making the Method Robustly Reproducible as Well (Garneau et al., LREC 2020)
PDF: https://aclanthology.org/2020.lrec-1.681.pdf
Code: ngarneau/vecmap