Olgun Dursun
2026
TurkBench: A Benchmark for Evaluating Turkish Large Language Models
Cagri Toraman | Ahmet Kaan Sever | Ayşe Aysu Cengiz | Elif Ecem Arslan | Görkem Sevinç | Sarp Kantar | Mete Mert Birdal | Yusuf Faruk Güldemir | Ali Buğra Kanburoğlu | Sezen Felekoğlu | Birsen Şahin Kütük | Büşra Tufan | Elif Genç | Serkan Coşkun | Gupse Ekin Demir | Muhammed Emin Arayıcı | Olgun Dursun | Onur Gungor | Susan Üsküdarlı | Abdullah Topraksoy | Esra Darıcı
Proceedings of the Second Workshop on Natural Language Processing for Turkic Languages (SIGTURK 2026)
With the recent surge in the development of large language models, the need for comprehensive, language-specific evaluation benchmarks has become critical. While significant progress has been made in evaluating English-language models, benchmarks for other languages, particularly those with unique linguistic characteristics such as Turkish, remain less developed. Our study introduces TurkBench, a comprehensive benchmark designed to assess the capabilities of generative large language models in Turkish. TurkBench comprises 8,151 data samples across 21 distinct subtasks, organized under six main evaluation categories: Knowledge, Language Understanding, Reasoning, Content Moderation, Turkish Grammar and Vocabulary, and Instruction Following. The diverse range of tasks and the culturally relevant data provide researchers and developers with a valuable tool for evaluating their models and identifying areas for improvement. We further publish our benchmark for online submissions at https://huggingface.co/turkbench.
2023
Incorporating Human Translator Style into English-Turkish Literary Machine Translation
Zeynep Yirmibeşoğlu | Olgun Dursun | Harun Dalli | Mehmet Şahin | Ena Hodzik | Sabri Gürses | Tunga Güngör
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
Although machine translation systems are mostly designed for the general domain, there is a growing tendency to adapt them to other domains such as literary translation. In this paper, we focus on English-Turkish literary translation and develop machine translation models that take into account the stylistic features of translators. We fine-tune a pre-trained machine translation model on the manually aligned works of a particular translator. We analyze in detail the effects of manual and automatic alignments, data augmentation methods, and corpus size on the translations. We propose an approach based on stylistic features to evaluate the style of a translator in the output translations. We show that the human translator's style can be recreated to a high degree in the target machine translations by adapting the models to the style of the translator.
Co-authors
- Muhammed Emin Arayıcı 1
- Elif Ecem Arslan 1
- Mete Mert Birdal 1
- Ayşe Aysu Cengiz 1
- Serkan Coşkun 1
- Harun Dalli 1
- Esra Darıcı 1
- Gupse Ekin Demir 1
- Sezen Felekoğlu 1
- Elif Genç 1
- Tunga Gungor 1
- Yusuf Faruk Güldemir 1
- Onur Güngör 1
- Sabri Gürses 1
- Ena Hodzik 1
- Ali Buğra Kanburoğlu 1
- Sarp Kantar 1
- Birsen Şahin Kütük 1
- Ahmet Kaan Sever 1
- Görkem Sevinç 1
- Abdullah Topraksoy 1
- Cagri Toraman 1
- Büşra Tufan 1
- Zeynep Yirmibeşoğlu 1
- Susan Üsküdarlı 1
- Mehmet Şahin 1