SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset

Niyazi Ahmet Metin, Sevde Yılmaz, Osman Enes Erdoğdu, Elif Sude Meydan, Oğul Sümer, Dilara Keküllüoğlu


Abstract
Sarcasm is a colloquial form of language that conveys messages in a non-literal way, which affects the performance of many NLP tasks. Sarcasm detection is not trivial, and existing work focuses mainly on English. We present SarcasTürk, a context-aware Turkish sarcasm detection dataset built from entries on Ekşi Sözlük, a large-scale Turkish online discussion platform where people frequently use sarcasm. SarcasTürk contains 1,515 entries from 98 titles with binary sarcasm labels and a title-level context field created to support comparisons between entry-only and context-aware models. We generate these contexts by selecting representative sentences from all entries under a title using summarization techniques. We report baseline results for a fine-tuned BERTurk classifier and zero-shot LLMs under both no-context and context-aware conditions. We find that the BERTurk model with title-level context performs best, with 0.76 accuracy and balanced class-wise F1 scores (0.77 for sarcasm, 0.75 for no sarcasm). Because the dataset contains potentially sensitive and offensive language, SarcasTürk is available upon request from the authors.
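The comparison between the no-context and context-aware conditions, and the reported metrics (accuracy and class-wise F1), can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Entry` field names, the `build_input` helper, and the `" [SEP] "` separator are assumptions made for illustration, since the abstract does not specify the dataset schema or how context is joined to an entry.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    # Hypothetical field names; the real dataset schema is not given in the abstract.
    text: str     # the entry to classify
    context: str  # title-level context (representative sentences for the title)
    label: int    # 1 = sarcasm, 0 = no sarcasm


def build_input(entry: Entry, use_context: bool, sep: str = " [SEP] ") -> str:
    """Build the model input for the no-context or context-aware condition.

    Prepending the context with a separator is an assumption, not the paper's method.
    """
    if use_context and entry.context:
        return entry.context + sep + entry.text
    return entry.text


def class_f1(gold: List[int], pred: List[int], cls: int) -> float:
    """F1 for one class, computed as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def evaluate(gold: List[int], pred: List[int]) -> Dict[str, float]:
    """Return accuracy and class-wise F1 for binary sarcasm labels."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    accuracy = sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)
    return {
        "accuracy": accuracy,
        "f1_sarcasm": class_f1(gold, pred, 1),
        "f1_no_sarcasm": class_f1(gold, pred, 0),
    }


if __name__ == "__main__":
    # Toy example only; these are not entries or predictions from SarcasTürk.
    toy = Entry(text="toy entry", context="toy context", label=1)
    print(build_input(toy, use_context=False))
    print(build_input(toy, use_context=True))
    print(evaluate([1, 1, 0, 0], [1, 0, 0, 0]))
```

Running the same classifier once on `build_input(e, use_context=False)` and once on `build_input(e, use_context=True)`, then passing each set of predictions to `evaluate`, gives the kind of side-by-side comparison the abstract describes.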
Anthology ID:
2026.sigturk-1.6
Volume:
Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Kemal Oflazer, Abdullatif Köksal, Onur Varol
Venues:
SIGTURK | WS
Publisher:
Association for Computational Linguistics
Pages:
61–71
URL:
https://aclanthology.org/2026.sigturk-1.6/
Cite (ACL):
Niyazi Ahmet Metin, Sevde Yılmaz, Osman Enes Erdoğdu, Elif Sude Meydan, Oğul Sümer, and Dilara Keküllüoğlu. 2026. SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset. In Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026), pages 61–71, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset (Metin et al., SIGTURK 2026)
PDF:
https://aclanthology.org/2026.sigturk-1.6.pdf