Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction

Yova Kementchedjhieva, Sebastian Ruder, Ryan Cotterell, Anders Søgaard


Abstract
Most recent approaches to bilingual dictionary induction find a linear alignment between the word vector spaces of two languages. We show that projecting the two languages onto a third, latent space, rather than directly onto each other, while equivalent in terms of expressivity, makes it easier to learn approximate alignments. Our modified approach also allows supporting languages to be included in the alignment process, yielding even better performance in low-resource settings.
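The idea of aligning every language to a shared latent space can be illustrated with textbook Generalized Procrustes Analysis. The following is a minimal sketch and not the authors' implementation (their code is in the repository listed under Code). It assumes each embedding matrix has one row per seed-dictionary entry, with rows aligned across languages, and that the maps are constrained to be orthogonal. The names procrustes, gpa_loss, generalized_procrustes and random_rotation are hypothetical, and the data is synthetic.

```python
import numpy as np

def procrustes(X, G):
    """Orthogonal T minimizing ||X @ T - G||_F (closed form via SVD)."""
    U, _, Vt = np.linalg.svd(X.T @ G)
    return U @ Vt

def gpa_loss(spaces, maps, G):
    """Sum of squared distances between each mapped space and the latent space G."""
    return sum(np.linalg.norm(X @ T - G) ** 2 for X, T in zip(spaces, maps))

def generalized_procrustes(spaces, n_iter=100, tol=1e-12):
    """Alternate between fitting each map to G and re-estimating G as the mean."""
    G = np.mean(spaces, axis=0)          # initial latent space: plain average
    prev = np.inf
    for _ in range(n_iter):
        maps = [procrustes(X, G) for X in spaces]
        G = np.mean([X @ T for X, T in zip(spaces, maps)], axis=0)
        loss = gpa_loss(spaces, maps, G)
        if prev - loss < tol:            # stop once the loss no longer improves
            break
        prev = loss
    return maps, G

# Synthetic stand-in for three languages: rotated, slightly noisy copies of one space.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 5))

def random_rotation(d):
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return Q

spaces = [Z @ random_rotation(5) + 0.01 * rng.normal(size=Z.shape) for _ in range(3)]
maps, G = generalized_procrustes(spaces)
```

With only two matrices in spaces this corresponds to the bilingual case; adding further matrices is one way to picture how supporting languages could enter the alignment.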
Anthology ID:
K18-1021
Volume:
Proceedings of the 22nd Conference on Computational Natural Language Learning
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Anna Korhonen, Ivan Titov
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
211–220
URL:
https://aclanthology.org/K18-1021
DOI:
10.18653/v1/K18-1021
Cite (ACL):
Yova Kementchedjhieva, Sebastian Ruder, Ryan Cotterell, and Anders Søgaard. 2018. Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction. In Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 211–220, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction (Kementchedjhieva et al., CoNLL 2018)
PDF:
https://aclanthology.org/K18-1021.pdf
Code:
YovaKem/generalized-procrustes-MUSE