Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English–Turkish Scientific Texts

Ali Gebeşçe, Abdulfattah Safa, Ege Uğur Amasya, Gözde Gül Şahin


Abstract
This paper presents an overview of the SIGTURK 2026 Shared Task on Terminology-Aware Machine Translation for English–Turkish Scientific Texts. We address the critical challenge of terminological accuracy in low-resource settings by constructing the first terminology-rich English–Turkish parallel corpus, comprising 3,300 sentence pairs from STEM domains with 10,157 expert-validated term pairs. The shared task consists of three subtasks: term detection, expert-guided correction, and end-to-end post-editing. We evaluate state-of-the-art baselines (including GPT-5.2 and Claude Sonnet 4.5) alongside participant systems employing diverse strategies from fine-tuning to Retrieval-Augmented Generation (RAG). Our results highlight that while massive generalist models dominate zero-shot detection, smaller, domain-adapted models using Supervised Fine-Tuning and Reinforcement Learning can significantly outperform them in end-to-end post-editing. Furthermore, we find that rigid retrieval pipelines often disrupt fluency, whereas Chain-of-Thought prompting allows models to integrate terminology more naturally. Despite these advances, a significant gap remains between automated systems and human expert performance in strict terminology correction.
Anthology ID:
2026.sigturk-1.20
Volume:
Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Kemal Oflazer, Abdullatif Köksal, Onur Varol
Venues:
SIGTURK | WS
Publisher:
Association for Computational Linguistics
Pages:
236–247
URL:
https://aclanthology.org/2026.sigturk-1.20/
Cite (ACL):
Ali Gebeşçe, Abdulfattah Safa, Ege Uğur Amasya, and Gözde Gül Şahin. 2026. Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English–Turkish Scientific Texts. In Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026), pages 236–247, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Overview of the SIGTURK 2026 Shared Task: Terminology-Aware Machine Translation for English–Turkish Scientific Texts (Gebeşçe et al., SIGTURK 2026)
PDF:
https://aclanthology.org/2026.sigturk-1.20.pdf