Multilingual Factor Analysis

Francisco Vargas, Kamen Brestnichki, Alex Papadopoulos Korfiatis, Nils Hammerla


Abstract
In this work, we approach the task of learning multilingual word representations in an offline manner by fitting a generative latent variable model to a multilingual dictionary. We model equivalent words in different languages as different views of the same word, generated by a common latent variable representing their latent lexical meaning. We explore the task of alignment by querying the fitted model for multilingual embeddings, achieving competitive results across a variety of tasks. The proposed model is robust to noise in the embedding space, making it a suitable method for distributed representations learned from noisy corpora.
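For illustration only, a generic multi-view linear-Gaussian factor model of the kind the abstract describes could be written as below. The notation is an assumption introduced here and does not come from this page: $z$ is the shared latent meaning, $x_\ell$ is the embedding of the word in language $\ell$, and $W_\ell$, $\mu_\ell$, $\Psi_\ell$ are per-language loadings, mean, and noise covariance. The paper's exact parameterization may differ.

```latex
% A sketch under assumed notation, not the paper's stated formulation.
\begin{align}
  z &\sim \mathcal{N}(0, I_k), \\
  x_\ell \mid z &\sim \mathcal{N}\!\left(W_\ell z + \mu_\ell,\; \Psi_\ell\right),
      \qquad \ell = 1, \dots, L, \\
  % Querying the fitted model with a single view gives the standard
  % linear-Gaussian posterior mean of the shared latent variable:
  \mathbb{E}[z \mid x_\ell] &= W_\ell^{\top}
      \left(W_\ell W_\ell^{\top} + \Psi_\ell\right)^{-1} (x_\ell - \mu_\ell).
\end{align}
```

Under this sketch, words from different languages are compared in the shared space by mapping each embedding to its posterior mean $\mathbb{E}[z \mid x_\ell]$.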
Anthology ID:
P19-1170
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1738–1750
URL:
https://aclanthology.org/P19-1170
DOI:
10.18653/v1/P19-1170
Cite (ACL):
Francisco Vargas, Kamen Brestnichki, Alex Papadopoulos Korfiatis, and Nils Hammerla. 2019. Multilingual Factor Analysis. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1738–1750, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Multilingual Factor Analysis (Vargas et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1170.pdf
Supplementary:
P19-1170.Supplementary.pdf
Video:
https://vimeo.com/384494210
Code:
Babylonpartners/MultilingualFactorAnalysis