Aysegul Gunduz


2025

Enhancing Essay Scoring with GPT-2 Using Back Translation Techniques
Aysegul Gunduz | Mark Gierl | Okan Bulut
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers

This study evaluates GPT-2 (small) for automated essay scoring on the ASAP dataset. Back-translation (English–Turkish–English) improved performance, especially on imbalanced sets. Quadratic weighted kappa (QWK) peaked at 0.77. The findings highlight the value of augmentation and the need for more advanced, rubric-aware models for fairer assessment.
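
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how quadratic weighted kappa (QWK) is commonly computed between human and model scores. It is not taken from the paper: the function name, argument names, and the toy scores are illustrative assumptions, and the authors' actual evaluation code may differ.

```python
from collections import Counter


def quadratic_weighted_kappa(human, model, min_score, max_score):
    """Agreement between two lists of integer scores.

    1.0 means perfect agreement, 0.0 means chance-level agreement,
    and negative values mean worse-than-chance agreement.
    """
    if len(human) != len(model) or not human:
        raise ValueError("score lists must be non-empty and of equal length")

    n = len(human)
    observed = Counter(zip(human, model))  # observed (human, model) score pairs
    human_hist = Counter(human)            # marginal counts for human scores
    model_hist = Counter(model)            # marginal counts for model scores

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty: larger disagreements cost more.
            # The usual 1 / (K - 1)^2 normalizer cancels in the ratio below.
            weight = (i - j) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two raters were independent.
            weighted_expected += weight * human_hist[i] * model_hist[j] / n

    if weighted_expected == 0:
        # Both raters used one identical score throughout; kappa is formally
        # undefined here, and this sketch treats it as perfect agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Toy scores (hypothetical, not from the ASAP dataset).
perfect = quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4)
reversed_scores = quadratic_weighted_kappa([1, 2, 3, 4], [4, 3, 2, 1], 1, 4)
partial = quadratic_weighted_kappa([1, 2, 3], [1, 2, 2], 1, 3)
```

Because the penalty grows with the square of the score difference, a model that is off by one point is penalized far less than one that is off by three, which is why QWK is a common choice for ordinal essay scores.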