Diana Cuevas Plancarte
2024
MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms
Patrick Lee | Alain Chirino Trujillo | Diana Cuevas Plancarte | Olumide Ojo | Xinyi Liu | Iyanuoluwa Shode | Yuan Zhao | Anna Feldman | Jing Peng
Findings of the Association for Computational Linguistics: EACL 2024
Euphemisms are found across the world’s languages, making them a universal linguistic phenomenon. As such, euphemistic data may have useful properties for computational tasks across languages. In this study, we explore this premise by training a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases where multilingual models outperform monolingual models on the task by a statistically significant margin, indicating that multilingual data presents additional opportunities for models to learn about cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic “categories” such as death and bodily functions. To further understand the nature of the cross-lingual transfer, we test whether cross-lingual data of the same domain is more important than within-language data of other domains.