Probing Pre-Trained Language Models for Cross-Cultural Differences in Values

Arnav Arora, Lucie-aimée Kaffee, Isabelle Augenstein

Abstract
Language embeds information about social, cultural, and political values people hold. Prior work has explored potentially harmful social biases encoded in Pre-trained Language Models (PLMs). However, there has been no systematic study investigating how values embedded in these models vary across cultures. In this paper, we introduce probes to study which cross-cultural values are embedded in these models, and whether they align with existing theories and cross-cultural values surveys. We find that PLMs capture differences in values across cultures, but those only weakly align with established values surveys. We discuss implications of using mis-aligned models in cross-cultural settings, as well as ways of aligning PLMs with values surveys.
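Below is a minimal sketch of the kind of probing the abstract describes: querying a masked language model with a values-survey-style prompt and inspecting its predictions. The model choice and prompt wording here are illustrative assumptions, not the paper's actual probes.

```python
from transformers import pipeline

# Load a multilingual masked LM; the paper probes PLMs, but this
# specific model is an assumption for illustration.
fill_mask = pipeline("fill-mask", model="bert-base-multilingual-cased")

# A hypothetical values-survey-style prompt; real probes would be
# derived from established cross-cultural values surveys.
prompt = "In Germany, it is [MASK] important for children to learn obedience at home."

# Inspect the model's top completions and their probabilities.
for prediction in fill_mask(prompt, top_k=5):
    print(f"{prediction['token_str']:>12}  p={prediction['score']:.3f}")
```

Comparing such completions across prompts that vary only the country mentioned is one way to surface the cross-cultural value differences a PLM encodes, which can then be checked against survey responses.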
Anthology ID:
2023.c3nlp-1.12
Volume:
Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Sunipa Dev, Vinodkumar Prabhakaran, David Adelani, Dirk Hovy, Luciana Benotti
Venue:
C3NLP
Publisher:
Association for Computational Linguistics
Pages:
114–130
URL:
https://aclanthology.org/2023.c3nlp-1.12
DOI:
10.18653/v1/2023.c3nlp-1.12
Cite (ACL):
Arnav Arora, Lucie-aimée Kaffee, and Isabelle Augenstein. 2023. Probing Pre-Trained Language Models for Cross-Cultural Differences in Values. In Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP), pages 114–130, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Probing Pre-Trained Language Models for Cross-Cultural Differences in Values (Arora et al., C3NLP 2023)
PDF:
https://aclanthology.org/2023.c3nlp-1.12.pdf
Video:
https://aclanthology.org/2023.c3nlp-1.12.mp4