Alankaar: A Dataset for Figurativeness Understanding in Bangla

Geetanjali Rakshit, Jeffrey Flanigan


Abstract
Bangla has a rich written literature that is replete with examples of creative language use. Because of a lack of resources, there have been only limited efforts to computationally analyze creative text in Bangla. We present Alankaar, a dataset of 2500 manually annotated Bangla text fragments containing metaphors, together with automatic and manual English translations of these examples. Additionally, we provide 2500 examples of non-metaphorical text in Bangla. We use this dataset to build a metaphor identification system for Bangla. We also use it as a test bed for cross-lingual metaphor translation, finding that not all metaphors translate literally across languages and that several cultural factors are at play in the translation of metaphors. We hope this will advance research on metaphor translation and on grounding the cultural nuances at work in machine translation.
Anthology ID:
2025.ranlp-1.114
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
998–1002
URL:
https://aclanthology.org/2025.ranlp-1.114/
Cite (ACL):
Geetanjali Rakshit and Jeffrey Flanigan. 2025. Alankaar: A Dataset for Figurativeness Understanding in Bangla. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 998–1002, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Alankaar: A Dataset for Figurativeness Understanding in Bangla (Rakshit & Flanigan, RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.114.pdf