Representational Isomorphism and Alignment of Multilingual Large Language Models

Di Wu, Yibin Lei, Andrew Yates, Christof Monz


Abstract
In this extended abstract, we investigate the capability of Large Language Models (LLMs) to represent texts in multilingual contexts. Our findings reveal that sentence representations derived from LLMs exhibit a high degree of isomorphism across languages. This existing isomorphism facilitates representational alignment in few-shot settings: by applying a contrastive objective at the representation level with only a small number (e.g., 100) of translation pairs, we significantly improve models’ performance on Semantic Textual Similarity (STS) tasks across languages.
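
To make the abstract's method concrete, below is a minimal sketch of few-shot contrastive alignment on top of fixed sentence embeddings, assuming PyTorch, a symmetric InfoNCE-style in-batch objective over translation pairs, and a small linear projection head. The random tensors stand in for LLM-derived sentence embeddings; the names (info_nce_loss, proj) and all hyperparameters (temperature, learning rate, step count) are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn.functional as F

def info_nce_loss(src_emb, tgt_emb, temperature=0.05):
    # Treat the i-th source/target pair as the positive and all other
    # in-batch sentences as negatives (symmetric InfoNCE).
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature        # (B, B) cosine similarities
    labels = torch.arange(src.size(0))        # positives lie on the diagonal
    return (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.T, labels)) / 2

# Stand-ins for LLM sentence embeddings of ~100 translation pairs;
# in the paper's setting these would come from the model being aligned.
num_pairs, dim = 100, 768
src_emb = torch.randn(num_pairs, dim)         # e.g., English sentences
tgt_emb = torch.randn(num_pairs, dim)         # their translations

proj = torch.nn.Linear(dim, dim)              # lightweight alignment head (assumption)
opt = torch.optim.AdamW(proj.parameters(), lr=1e-4)

for step in range(100):
    opt.zero_grad()
    loss = info_nce_loss(proj(src_emb), proj(tgt_emb))
    loss.backward()
    opt.step()

Because the representations are already near-isomorphic across languages, even this tiny supervision budget suffices to rotate them into alignment rather than learn new structure from scratch.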
Anthology ID: 2024.mrl-1.24
Volume: Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024)
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Jonne Sälevä, Abraham Owodunni
Venue: MRL
Publisher: Association for Computational Linguistics
Pages: 293–297
URL: https://aclanthology.org/2024.mrl-1.24
Cite (ACL): Di Wu, Yibin Lei, Andrew Yates, and Christof Monz. 2024. Representational Isomorphism and Alignment of Multilingual Large Language Models. In Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024), pages 293–297, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Representational Isomorphism and Alignment of Multilingual Large Language Models (Wu et al., MRL 2024)
PDF: https://aclanthology.org/2024.mrl-1.24.pdf