Multi-Adversarial Learning for Cross-Lingual Word Embeddings

Haozhou Wang; James Henderson; Paola Merlo

doi:10.18653/v1/2021.naacl-main.39

Abstract

Generative adversarial networks (GANs) have succeeded in inducing cross-lingual word embeddings - maps of matching words across languages - without supervision. Despite these successes, GANs’ performance for the difficult case of distant languages is still not satisfactory. These limitations have been explained by GANs’ incorrect assumption that source and target embedding spaces are related by a single linear mapping and are approximately isomorphic. We assume instead that, especially across distant languages, the mapping is only piece-wise linear, and propose a multi-adversarial learning method. This novel method induces the seed cross-lingual dictionary through multiple mappings, each induced to fit the mapping for one subspace. Our experiments on unsupervised bilingual lexicon induction and cross-lingual document classification show that this method improves performance over previous single-mapping methods, especially for distant languages.
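
To make the piece-wise linear assumption concrete, the following is a minimal sketch, not the authors' implementation. It assumes the source space has already been split into regions, each represented by a centroid, and that one linear map per region is available; the function name `piecewise_linear_map`, the nearest-centroid routing, and the toy matrices are all hypothetical choices made for illustration. How the regions and maps are actually learned (adversarially, in the paper) is out of scope here.

```python
import numpy as np

def piecewise_linear_map(X, centroids, mappings):
    """Map each row of X with the linear map of its nearest centroid.

    X:         (n, d) source embeddings
    centroids: (k, d) one centroid per region (assumed given)
    mappings:  (k, d, d) one linear map per region (assumed given)
    """
    # Squared Euclidean distance from every vector to every centroid: (n, k)
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    regions = dists.argmin(axis=1)
    # Select one (d, d) matrix per row and apply it:
    # mapped[i] = X[i] @ mappings[regions[i]]
    mapped = np.einsum("nd,nde->ne", X, mappings[regions])
    return mapped, regions

# Toy data (illustrative only): two regions in a 2-D space.
X = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
centroids = np.array([[1.0, 0.0], [-1.0, 0.0]])
mappings = np.stack([
    np.eye(2),                            # region 0: identity
    np.array([[0.0, 1.0], [-1.0, 0.0]]),  # region 1: a 90-degree rotation
])

mapped, regions = piecewise_linear_map(X, centroids, mappings)
```

A single-mapping method corresponds to the special case of one region (`k = 1`), where every vector is transformed by the same matrix.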

Anthology ID: 2021.naacl-main.39
Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month: June
Year: 2021
Address: Online
Editors: Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 463–472
URL: https://aclanthology.org/2021.naacl-main.39
DOI: 10.18653/v1/2021.naacl-main.39
Cite (ACL): Haozhou Wang, James Henderson, and Paola Merlo. 2021. Multi-Adversarial Learning for Cross-Lingual Word Embeddings. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 463–472, Online. Association for Computational Linguistics.
Cite (Informal): Multi-Adversarial Learning for Cross-Lingual Word Embeddings (Wang et al., NAACL 2021)
PDF: https://aclanthology.org/2021.naacl-main.39.pdf
Video: https://aclanthology.org/2021.naacl-main.39.mp4
Data: MLDoc
