Ummugul Bezirhan


2025

Input Optimization for Automated Scoring in Reading Assessment
Ji Yoon Jung | Ummugul Bezirhan | Matthias von Davier
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers

This study examines input optimization for enhanced efficiency in automated scoring (AS) of reading assessments, which typically involve lengthy passages and complex scoring guides. We propose optimizing input size using question-specific summaries and simplified scoring guides. Findings indicate that input optimization via compression is achievable while maintaining AS performance.

Optimizing Reliability Scoring for ILSAs
Ji Yoon Jung | Ummugul Bezirhan | Matthias von Davier
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers

This study proposes an innovative method for evaluating cross-country scoring reliability (CCSR) in multilingual assessments, using hyperparameter optimization and similarity-based weighted majority scoring within a single human scoring framework. Results show that this approach provides a cost-effective and comprehensive assessment of CCSR without the need for additional raters.

AI-Based Classification of TIMSS Items for Framework Alignment
Ummugul Bezirhan | Matthias von Davier
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers

Large-scale assessments rely on expert panels to verify that test items align with prescribed frameworks, a labor-intensive process. This study evaluates the use of GPT-4o to classify TIMSS items by content domain, cognitive domain, and difficulty category. Findings highlight the potential of language models to support scalable, framework-aligned item verification.