TimeRes: A Turkish Benchmark For Evaluating Temporal Understanding of Large Language Models

Habib Yağız Demir; Ümit Atlamaz; Susan Üsküdarlı

Abstract

Temporal information is an essential part of communication, and understanding language requires processing it effectively. Despite recent advances, Large Language Models (LLMs) still struggle with temporal understanding. Existing benchmarks primarily focus on English and underexplore how linguistic structure contributes to temporal meaning. As a result, temporal understanding in languages other than English remains largely understudied. In this paper, we introduce TimeRes, a Turkish benchmark for evaluating temporal understanding of LLMs. TimeRes aims to investigate comprehension of Reichenbach’s temporal points and reported speech through date arithmetic. Our dataset includes 4,600 questions across 4 tasks at two levels of complexity, and presents a paired question formulation to distinguish temporal discourse understanding from temporal arithmetic capabilities. We evaluated six LLMs, and demonstrated that models struggle to resolve reported speech and fail to generalize across word order variations.

Anthology ID: 2026.eacl-srw.67
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Selene Baez Santamaria, Sai Ashish Somayajula, Atsuki Yamaguchi
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 910–920
URL: https://aclanthology.org/2026.eacl-srw.67/
Cite (ACL): Habib Yağız Demir, Ümit Atlamaz, and Susan Üsküdarlı. 2026. TimeRes: A Turkish Benchmark For Evaluating Temporal Understanding of Large Language Models. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 910–920, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): TimeRes: A Turkish Benchmark For Evaluating Temporal Understanding of Large Language Models (Demir et al., EACL 2026)
PDF: https://aclanthology.org/2026.eacl-srw.67.pdf
