Unsupervised Cross-Lingual Representation Learning

Sebastian Ruder, Anders Søgaard, Ivan Vulić


Abstract
In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations. After a brief history of supervised cross-lingual word representations, we focus on: 1) how to induce weakly-supervised and unsupervised cross-lingual word representations in truly resource-poor settings where bilingual supervision cannot be guaranteed; 2) critical examinations of the training conditions and requirements under which unsupervised algorithms can and cannot work effectively; 3) more robust methods that mitigate the instability and low performance observed for distant language pairs; 4) how to comprehensively evaluate such representations; and 5) diverse applications that benefit from cross-lingual word representations (e.g., machine translation, dialogue, cross-lingual sequence labeling and structured prediction, and cross-lingual information retrieval).
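
To make the mapping-based induction discussed above concrete, the following is a minimal sketch (not the tutorial's own code) of the orthogonal Procrustes solution commonly used for weakly-supervised cross-lingual embedding induction: given n seed translation pairs whose pre-trained monolingual embeddings are stacked into matrices X (source) and Y (target), the orthogonal map W minimizing ||XW - Y||_F is obtained from an SVD. All variable names here are illustrative assumptions.

import numpy as np

def procrustes(X, Y):
    # Solve min_W ||X W - Y||_F subject to W being orthogonal.
    # X, Y: (n, d) arrays; row i holds the source/target embeddings
    # of the i-th seed dictionary pair.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy check: recover a hidden rotation from 5 seed pairs.
rng = np.random.default_rng(0)
d = 50
src = rng.normal(size=(5, d))                      # source seed vectors
hidden, _ = np.linalg.qr(rng.normal(size=(d, d)))  # unknown orthogonal map
tgt = src @ hidden                                 # target seed vectors
W = procrustes(src, tgt)
print(np.allclose(src @ W, tgt))                   # True: mapping recovered

Fully unsupervised variants covered in the tutorial replace the seed dictionary, e.g. with adversarially induced or heuristically extracted (identical-string) pairs, and then refine the mapping iteratively.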
Anthology ID:
P19-4007
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Preslav Nakov, Alexis Palmer
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
31–38
URL:
https://aclanthology.org/P19-4007
DOI:
10.18653/v1/P19-4007
Cite (ACL):
Sebastian Ruder, Anders Søgaard, and Ivan Vulić. 2019. Unsupervised Cross-Lingual Representation Learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts, pages 31–38, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Cross-Lingual Representation Learning (Ruder et al., ACL 2019)
PDF:
https://aclanthology.org/P19-4007.pdf