Cross-Lingual Idiom Sense Clustering in German and English

Mohammed Absar


Abstract
Idioms are expressions with non-literal and non-compositional meanings. For this reason, they pose a unique challenge for various NLP tasks including Machine Translation and Sentiment Analysis. In this paper, we propose an approach to clustering idioms in different languages by their sense. We leverage pre-trained cross-lingual transformer models and fine-tune them to produce cross-lingual vector representations of idioms according to their sense.
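As a rough illustration of what clustering idioms by sense could look like once each idiom has a vector representation, the sketch below groups idioms whose vectors have high cosine similarity. It is not the paper's method: the three-dimensional vectors are invented toy values standing in for embeddings from a fine-tuned cross-lingual model, and the function names, the greedy strategy, and the 0.8 threshold are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def cluster_by_sense(embeddings, threshold=0.8):
    """Greedy single-pass clustering (hypothetical helper).

    Each idiom joins the first cluster whose seed vector is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    seeds = []     # one representative vector per cluster
    clusters = []  # idiom strings grouped by cluster
    for idiom, vec in embeddings.items():
        for i, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                clusters[i].append(idiom)
                break
        else:
            seeds.append(vec)
            clusters.append([idiom])
    return clusters

# Toy vectors invented for illustration; real embeddings would come
# from a cross-lingual encoder and have hundreds of dimensions.
toy_embeddings = {
    "kick the bucket": [0.9, 0.1, 0.0],
    "den Löffel abgeben": [0.85, 0.15, 0.05],
    "break the ice": [0.0, 0.2, 0.95],
    "das Eis brechen": [0.05, 0.1, 0.9],
}

clusters = cluster_by_sense(toy_embeddings)
for members in clusters:
    print(members)
```

With these toy values, the English and German idioms for dying fall into one group and the two ice-breaking idioms into another. A greedy pass depends on input order, so a real experiment would more likely use an established clustering algorithm over the model's embeddings.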
Anthology ID:
2023.nlp4tia-1.3
Volume:
Proceedings of the First Workshop on NLP Tools and Resources for Translation and Interpreting Applications
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Raquel Lázaro Gutiérrez, Antonio Pareja, Ruslan Mitkov
Venues:
NLP4TIA | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
13–19
URL:
https://aclanthology.org/2023.nlp4tia-1.3
Cite (ACL):
Mohammed Absar. 2023. Cross-Lingual Idiom Sense Clustering in German and English. In Proceedings of the First Workshop on NLP Tools and Resources for Translation and Interpreting Applications, pages 13–19, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Cross-Lingual Idiom Sense Clustering in German and English (Absar, NLP4TIA-WS 2023)
PDF:
https://aclanthology.org/2023.nlp4tia-1.3.pdf