MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms

Patrick Lee, Alain Chirino Trujillo, Diana Cuevas Plancarte, Olumide Ojo, Xinyi Liu, Iyanuoluwa Shode, Yuan Zhao, Anna Feldman, Jing Peng


Abstract
Euphemisms are found across the world’s languages, making them a universal linguistic phenomenon. As such, euphemistic data may have useful properties for computational tasks across languages. In this study, we explore this premise by training a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases where multilingual models outperform monolingual models on the task by a statistically significant margin, indicating that multilingual data presents additional opportunities for models to learn about cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic “categories” such as death and bodily functions. To further understand the nature of the cross-lingual transfer, we test whether cross-lingual data from the same domain is more important than within-language data from other domains.
Anthology ID:
2024.findings-eacl.59
Volume:
Findings of the Association for Computational Linguistics: EACL 2024
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
875–881
URL:
https://aclanthology.org/2024.findings-eacl.59
Cite (ACL):
Patrick Lee, Alain Chirino Trujillo, Diana Cuevas Plancarte, Olumide Ojo, Xinyi Liu, Iyanuoluwa Shode, Yuan Zhao, Anna Feldman, and Jing Peng. 2024. MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms. In Findings of the Association for Computational Linguistics: EACL 2024, pages 875–881, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms (Lee et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-eacl.59.pdf