Evaluating Automated Scoring Models on Official ENEM Essays

Laís Nuto Rossman; Igor Cataneo Silveira; Denis Deratani Mauá

Abstract

Automated Essay Scoring systems can relieve teachers of the laborious task of grading essays and allow students to practice more frequently thanks to faster feedback cycles. In Brazilian Portuguese, there is growing interest in automatic scoring systems for the standardized ENEM exam. However, the only available datasets consist of essays written as practice for the official exam. To the best of our knowledge, no prior work evaluates official ENEM essays using mock-exam datasets. This work fills that gap by presenting a new labeled dataset of 157 essays written for the official ENEM exam. The analysis shows that this dataset has characteristics similar to those of existing datasets of mock-exam essays. The results also indicate that, for small datasets such as this one, using LLMs pretrained on mock exams significantly improves the performance of automatic scorers on official ENEM essays, yielding an average gain of 0.27 points in the Quadratic Weighted Kappa metric compared to training solely on official data.
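For readers unfamiliar with the Quadratic Weighted Kappa metric mentioned above, the following is a minimal sketch of the standard definition, not the authors' evaluation code. The function name, the example score scale (0 to 200 in steps of 40), and the sample scores are illustrative assumptions and do not come from the paper.

```python
def quadratic_weighted_kappa(y_true, y_pred, labels):
    """Cohen's kappa with quadratic weights over an ordered list of labels.

    `labels` lists every possible score in increasing order. The scale used
    in the example below is an assumption made for illustration only.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    k = len(labels)
    if k < 2:
        raise ValueError("at least two distinct labels are required")

    index = {label: i for i, label in enumerate(labels)}
    n = len(y_true)

    # Observed confusion matrix: rows are reference scores, columns are predictions.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: grows with the squared distance between ranks.
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were statistically independent.
            weighted_expected += weight * true_hist[i] * pred_hist[j] / n

    if weighted_expected == 0:
        # Both raters used one identical label throughout; kappa is undefined.
        return float("nan")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical score scale and scores, for illustration only.
SCALE = [0, 40, 80, 120, 160, 200]
human = [120, 160, 80, 200, 120]
model = [120, 120, 80, 160, 160]
kappa = quadratic_weighted_kappa(human, model, SCALE)
```

A value of 1 means perfect agreement with the reference scores, 0 means agreement no better than chance, and negative values mean systematic disagreement, so a gain of 0.27 is large on this scale.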

Anthology ID: 2026.propor-1.16
Volume: Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month: April
Year: 2026
Address: Salvador, Brazil
Editors: Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue: PROPOR
Publisher: Association for Computational Linguistics
Pages: 161–171
URL: https://aclanthology.org/2026.propor-1.16/
Cite (ACL): Laís Nuto Rossman, Igor Cataneo Silveira, and Denis Deratani Mauá. 2026. Evaluating Automated Scoring Models on Official ENEM Essays. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 161–171, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal): Evaluating Automated Scoring Models on Official ENEM Essays (Rossman et al., PROPOR 2026)
PDF: https://aclanthology.org/2026.propor-1.16.pdf
