Evaluating Automated Scoring Models on Official ENEM Essays
Laís Nuto Rossman | Igor Cataneo Silveira | Denis Deratani Mauá
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Automated Essay Scoring systems can relieve teachers of the laborious task of grading essays and allow students to practice more frequently thanks to faster feedback cycles. In Brazilian Portuguese, there is growing interest in automatic scoring systems for the standardized ENEM exam. However, the only available datasets consist of essays written as practice for the official exam; to the best of our knowledge, no prior work evaluates automatic scorers on official ENEM essays using mock-exam datasets. This work fills that gap by presenting a new labeled dataset of 157 essays written for the official ENEM exam. Our analysis shows that this dataset shares characteristics with existing datasets of mock-exam essays. The results also indicate that, for small datasets such as this one, the use of LLMs pretrained on mock exams significantly improves the performance of automatic scorers on official ENEM essays, yielding an average gain of 0.27 points in the Quadratic Weighted Kappa metric compared to training solely on official data.
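For readers unfamiliar with the evaluation metric mentioned above, the following is a minimal sketch of Quadratic Weighted Kappa (QWK), the agreement measure used to compare automatic scores against human grades. The score values below are illustrative placeholders, not taken from the paper's dataset.

```python
# Minimal sketch of Quadratic Weighted Kappa (QWK): chance-corrected
# agreement between two raters, penalizing disagreements by the squared
# distance between label ranks.

def quadratic_weighted_kappa(rater_a, rater_b):
    labels = sorted(set(rater_a) | set(rater_b))
    idx = {lab: i for i, lab in enumerate(labels)}
    k, n = len(labels), len(rater_a)
    # Observed agreement matrix (counts)
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[idx[a]][idx[b]] += 1
    # Expected matrix from the raters' marginal distributions
    marg_a = [sum(row) for row in obs]
    marg_b = [sum(obs[r][c] for r in range(k)) for c in range(k)]
    exp = [[marg_a[i] * marg_b[j] / n for j in range(k)] for i in range(k)]
    # Quadratic disagreement weights
    w = [[(i - j) ** 2 / (k - 1) ** 2 for j in range(k)] for i in range(k)]
    num = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    den = sum(w[i][j] * exp[i][j] for i in range(k) for j in range(k))
    return 1.0 - num / den

# Hypothetical ENEM-style scores (multiples of 40, as in the exam rubric)
human = [200, 160, 120, 200, 80]
model = [200, 160, 160, 200, 120]
print(round(quadratic_weighted_kappa(human, model), 3))  # → 0.808
```

QWK ranges from -1 to 1, with 1 meaning perfect agreement and 0 meaning chance-level agreement, so an average gain of 0.27 points is a substantial improvement on this scale.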