Understanding Cross-Lingual Alignment—A Survey

Katharina Hämmerl, Jindřich Libovický, Alexander Fraser


Abstract
Cross-lingual alignment, the meaningful similarity of representations across languages in multilingual language models, has been an active field of research in recent years. We survey the literature on techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field. We present different understandings of cross-lingual alignment and their limitations. We provide a qualitative summary of results from a number of surveyed papers. Finally, we discuss how these insights may be applied not only to encoder models, where this topic has been heavily studied, but also to encoder-decoder or even decoder-only models, and argue that an effective trade-off between language-neutral and language-specific information is key.
Anthology ID: 2024.findings-acl.649
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10922–10943
URL: https://aclanthology.org/2024.findings-acl.649
DOI: 10.18653/v1/2024.findings-acl.649
Cite (ACL): Katharina Hämmerl, Jindřich Libovický, and Alexander Fraser. 2024. Understanding Cross-Lingual Alignment—A Survey. In Findings of the Association for Computational Linguistics: ACL 2024, pages 10922–10943, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Understanding Cross-Lingual Alignment—A Survey (Hämmerl et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-acl.649.pdf