The Echoes of Multilinguality: Tracing Cultural Value Shifts during Language Model Fine-tuning

Rochelle Choenni, Anne Lauscher, Ekaterina Shutova


Abstract
Texts written in different languages reflect different culturally-dependent beliefs of their writers. Thus, we expect multilingual LMs (MLMs), which are jointly trained on a concatenation of text in multiple languages, to encode different cultural values for each language. Yet, as the ‘multilinguality’ of these LMs is driven by cross-lingual sharing, we also have reason to believe that cultural values bleed over from one language into another. This limits the use of MLMs in practice: apart from requiring proficiency in generating text in multiple languages, creating language technology that can serve a community also requires the output of LMs to be sensitive to that community's biases (Naous et al. 2023). Yet, little is known about how cultural values emerge and evolve in MLMs (Hershcovich et al. 2022). We are the first to study how languages can exert influence on the cultural values encoded for different test languages, by studying how such values are revised during fine-tuning. Focusing on the fine-tuning stage allows us to study the interplay between value shifts when the model is exposed to new linguistic experience from different data sources and languages. Lastly, we use a training data attribution method to find patterns in the fine-tuning examples, and the languages that they come from, that tend to instigate value shifts.
Anthology ID:
2024.acl-long.803
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
15042–15058
URL:
https://aclanthology.org/2024.acl-long.803
Cite (ACL):
Rochelle Choenni, Anne Lauscher, and Ekaterina Shutova. 2024. The Echoes of Multilinguality: Tracing Cultural Value Shifts during Language Model Fine-tuning. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15042–15058, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
The Echoes of Multilinguality: Tracing Cultural Value Shifts during Language Model Fine-tuning (Choenni et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-long.803.pdf