SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset

Niyazi Ahmet Metin; Sevde Yılmaz; Osman Enes Erdoğdu; Elif Sude Meydan; Oğul Sümer; Dilara Keküllüoğlu

Abstract

Sarcasm is a colloquial form of language used to convey messages in a non-literal way, which affects the performance of many NLP tasks. Sarcasm detection is not trivial, and existing work focuses mainly on English. We present SarcasTürk, a context-aware Turkish sarcasm detection dataset built from entries on Ekşi Sözlük, a large-scale Turkish online discussion platform where people frequently use sarcasm. SarcasTürk contains 1,515 entries from 98 titles, each with a binary sarcasm label and a title-level context field that supports comparisons between entry-only and context-aware models. We generate these contexts by selecting representative sentences from all entries under a title using summarization techniques. We report baseline results for a fine-tuned BERTurk classifier and zero-shot LLMs under both no-context and context-aware conditions. The BERTurk model with title-level context performs best, with 0.76 accuracy and balanced class-wise F1 scores (0.77 for sarcasm, 0.75 for no sarcasm). Because the dataset contains potentially sensitive and offensive language, SarcasTürk is available on request from the authors.
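The abstract does not say which summarization technique selects the representative sentences, or how the context is joined to an entry. As a rough illustration only, the sketch below assumes a simple word-frequency extractive scorer and a `[SEP]` separator. The function names `build_title_context` and `make_input`, the parameter `k`, and the sample sentences are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace (an assumption).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens; \w is Unicode-aware in Python 3.
    return re.findall(r"\w+", sentence.lower())


def build_title_context(entries, k=2):
    """Pick the k sentences whose words are most frequent across all entries of a title."""
    sentences = [s for entry in entries for s in split_sentences(entry)]
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:k])  # restore original order for readability
    return " ".join(sentences[i] for i in chosen)


def make_input(entry, context=None):
    # No-context condition: the entry alone. Context-aware: context, separator, entry.
    return entry if context is None else f"{context} [SEP] {entry}"


# Hypothetical entries under a single title.
entries = [
    "Hava çok güzel. Trafik berbat.",
    "Hava güzel ama trafik berbat.",
    "Kedi uyuyor.",
]
context = build_title_context(entries, k=2)
no_context_input = make_input(entries[0])
context_aware_input = make_input(entries[0], context)
```

Comparing a classifier fed `no_context_input` with one fed `context_aware_input` mirrors the entry-only versus context-aware comparison described above, although the authors' actual pipeline may differ in every detail.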

Anthology ID: 2026.sigturk-1.6
Volume: Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Kemal Oflazer, Abdullatif Köksal, Onur Varol
Venues: SIGTURK | WS
Publisher: Association for Computational Linguistics
Pages: 61–71
URL: https://aclanthology.org/2026.sigturk-1.6/
Cite (ACL): Niyazi Ahmet Metin, Sevde Yılmaz, Osman Enes Erdoğdu, Elif Sude Meydan, Oğul Sümer, and Dilara Keküllüoğlu. 2026. SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset. In Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026), pages 61–71, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset (Metin et al., SIGTURK 2026)
PDF: https://aclanthology.org/2026.sigturk-1.6.pdf
