Long context Automated Essay Scoring with Language Models

Christopher Ormerod, Gitit Kehat


Abstract
In this study, we evaluate several models that incorporate architectural modifications to overcome the length limitations of the standard transformer architecture using the Kaggle ASAP 2.0 dataset. The models considered in this study include fine-tuned versions of XLNet, Longformer, ModernBERT, Mamba, and Llama models.
Anthology ID:
2025.aimecon-main.5
Volume:
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers
Month:
October
Year:
2025
Address:
Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States
Editors:
Joshua Wilson, Christopher Ormerod, Magdalen Beiting Parrish
Venue:
AIME-Con
SIG:
Publisher:
National Council on Measurement in Education (NCME)
Note:
Pages:
35–42
Language:
URL:
https://aclanthology.org/2025.aimecon-main.5/
DOI:
Bibkey:
Cite (ACL):
Christopher Ormerod and Gitit Kehat. 2025. Long context Automated Essay Scoring with Language Models. In Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers, pages 35–42, Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States. National Council on Measurement in Education (NCME).
Cite (Informal):
Long context Automated Essay Scoring with Language Models (Ormerod & Kehat, AIME-Con 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.aimecon-main.5.pdf