BibTeX

@inproceedings{shi-mangalam-2025-upsc2m,
    title = "{UPSC}2{M}: Benchmarking Adaptive Learning from Two Million {MCQ} Attempts",
    author = "Shi, Kevin and
      Mangalam, Karttikeya",
    editor = {Kochmar, Ekaterina and
      Alhafni, Bashar and
      Bexte, Marie and
      Burstein, Jill and
      Horbach, Andrea and
      Laarmann-Quante, Ronja and
      Tack, Ana{\"i}s and
      Yaneva, Victoria and
      Yuan, Zheng},
    booktitle = "Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.bea-1.70/",
    doi = "10.18653/v1/2025.bea-1.70",
    pages = "931--936",
    ISBN = "979-8-89176-270-1",
    abstract = "We present UPSC2M, a large-scale dataset comprising two million multiple-choice question attempts from over 46,000 students, spanning nearly 9,000 questions across seven subject areas. The questions are drawn from the Union Public Service Commission (UPSC) examination, one of India{'}s most competitive and high-stakes assessments. Each attempt includes both response correctness and time taken, enabling fine-grained analysis of learner behavior and question characteristics. Over this dataset, we define two core benchmark tasks: question difficulty estimation and student performance prediction. The first task involves predicting empirical correctness rates using only question text. The second task focuses on predicting the likelihood of a correct response based on prior interactions. We evaluate simple baseline models on both tasks to demonstrate feasibility and establish reference points. Together, the dataset and benchmarks offer a strong foundation for building scalable, personalized educational systems. We release the dataset and code to support further research at the intersection of content understanding, learner modeling, and adaptive assessment."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="shi-mangalam-2025-upsc2m">
    <titleInfo>
        <title>UPSC2M: Benchmarking Adaptive Learning from Two Million MCQ Attempts</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Kevin</namePart>
        <namePart type="family">Shi</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Karttikeya</namePart>
        <namePart type="family">Mangalam</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Ekaterina</namePart>
            <namePart type="family">Kochmar</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Bashar</namePart>
            <namePart type="family">Alhafni</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Marie</namePart>
            <namePart type="family">Bexte</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jill</namePart>
            <namePart type="family">Burstein</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Andrea</namePart>
            <namePart type="family">Horbach</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Ronja</namePart>
            <namePart type="family">Laarmann-Quante</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Anaïs</namePart>
            <namePart type="family">Tack</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Victoria</namePart>
            <namePart type="family">Yaneva</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Zheng</namePart>
            <namePart type="family">Yuan</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Vienna, Austria</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-8-89176-270-1</identifier>
    </relatedItem>
    <abstract>We present UPSC2M, a large-scale dataset comprising two million multiple-choice question attempts from over 46,000 students, spanning nearly 9,000 questions across seven subject areas. The questions are drawn from the Union Public Service Commission (UPSC) examination, one of India’s most competitive and high-stakes assessments. Each attempt includes both response correctness and time taken, enabling fine-grained analysis of learner behavior and question characteristics. Over this dataset, we define two core benchmark tasks: question difficulty estimation and student performance prediction. The first task involves predicting empirical correctness rates using only question text. The second task focuses on predicting the likelihood of a correct response based on prior interactions. We evaluate simple baseline models on both tasks to demonstrate feasibility and establish reference points. Together, the dataset and benchmarks offer a strong foundation for building scalable, personalized educational systems. We release the dataset and code to support further research at the intersection of content understanding, learner modeling, and adaptive assessment.</abstract>
    <identifier type="citekey">shi-mangalam-2025-upsc2m</identifier>
    <identifier type="doi">10.18653/v1/2025.bea-1.70</identifier>
    <location>
        <url>https://aclanthology.org/2025.bea-1.70/</url>
    </location>
    <part>
        <date>2025-07</date>
        <extent unit="page">
            <start>931</start>
            <end>936</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T UPSC2M: Benchmarking Adaptive Learning from Two Million MCQ Attempts
%A Shi, Kevin
%A Mangalam, Karttikeya
%Y Kochmar, Ekaterina
%Y Alhafni, Bashar
%Y Bexte, Marie
%Y Burstein, Jill
%Y Horbach, Andrea
%Y Laarmann-Quante, Ronja
%Y Tack, Anaïs
%Y Yaneva, Victoria
%Y Yuan, Zheng
%S Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-270-1
%F shi-mangalam-2025-upsc2m
%X We present UPSC2M, a large-scale dataset comprising two million multiple-choice question attempts from over 46,000 students, spanning nearly 9,000 questions across seven subject areas. The questions are drawn from the Union Public Service Commission (UPSC) examination, one of India’s most competitive and high-stakes assessments. Each attempt includes both response correctness and time taken, enabling fine-grained analysis of learner behavior and question characteristics. Over this dataset, we define two core benchmark tasks: question difficulty estimation and student performance prediction. The first task involves predicting empirical correctness rates using only question text. The second task focuses on predicting the likelihood of a correct response based on prior interactions. We evaluate simple baseline models on both tasks to demonstrate feasibility and establish reference points. Together, the dataset and benchmarks offer a strong foundation for building scalable, personalized educational systems. We release the dataset and code to support further research at the intersection of content understanding, learner modeling, and adaptive assessment.
%R 10.18653/v1/2025.bea-1.70
%U https://aclanthology.org/2025.bea-1.70/
%U https://doi.org/10.18653/v1/2025.bea-1.70
%P 931-936

Markdown (Informal)

[UPSC2M: Benchmarking Adaptive Learning from Two Million MCQ Attempts](https://aclanthology.org/2025.bea-1.70/) (Shi & Mangalam, BEA 2025)

ACL

Kevin Shi and Karttikeya Mangalam. 2025. UPSC2M: Benchmarking Adaptive Learning from Two Million MCQ Attempts. In Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), pages 931–936, Vienna, Austria. Association for Computational Linguistics.
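
The abstract defines two benchmark tasks. As a rough illustration of the first (question difficulty estimation), the sketch below regresses each question's empirical correctness rate on a TF-IDF encoding of its text, in the spirit of the simple baselines the paper mentions. This is a minimal sketch under assumed names: the file `upsc2m_attempts.csv` and the columns `question_id`, `question_text`, and `correct` are illustrative only; consult the released dataset for the actual schema.

```python
# Hypothetical baseline for UPSC2M Task 1 (question difficulty estimation).
# Assumed schema: one row per attempt, with columns
#   question_id, question_text, correct (1 = answered correctly, 0 = not).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

attempts = pd.read_csv("upsc2m_attempts.csv")  # assumed file name

# Empirical correctness rate per question: fraction of attempts answered correctly.
rates = attempts.groupby("question_id").agg(
    text=("question_text", "first"),
    correctness_rate=("correct", "mean"),
).reset_index()

train, test = train_test_split(rates, test_size=0.2, random_state=0)

# TF-IDF over question text, then ridge regression onto the rate.
vectorizer = TfidfVectorizer(max_features=20_000, ngram_range=(1, 2))
X_train = vectorizer.fit_transform(train["text"])
X_test = vectorizer.transform(test["text"])

model = Ridge(alpha=1.0)
model.fit(X_train, train["correctness_rate"])

preds = model.predict(X_test).clip(0.0, 1.0)  # rates live in [0, 1]
print("MAE:", mean_absolute_error(test["correctness_rate"], preds))
```

For the second task (student performance prediction), a comparably simple baseline predicts the probability of a correct response by blending the student's running accuracy with the question's running correctness rate, using only interactions that precede the attempt being scored. Again the column names (`student_id`, `timestamp`) and the blend weight are assumptions for illustration, not the paper's method.

```python
# Hypothetical baseline for UPSC2M Task 2 (student performance prediction).
# Each attempt is scored from strictly earlier interactions only.
import pandas as pd

attempts = pd.read_csv("upsc2m_attempts.csv").sort_values("timestamp")

alpha = 0.5                          # blend weight between student and question rates
prior = attempts["correct"].mean()   # global fallback for unseen students/questions

student_hits = {}   # student_id -> [num_correct, num_attempts] seen so far
question_hits = {}  # question_id -> [num_correct, num_attempts] seen so far
preds = []

for row in attempts.itertuples():
    s = student_hits.setdefault(row.student_id, [0, 0])
    q = question_hits.setdefault(row.question_id, [0, 0])
    s_rate = s[0] / s[1] if s[1] else prior
    q_rate = q[0] / q[1] if q[1] else prior
    preds.append(alpha * s_rate + (1 - alpha) * q_rate)
    # Update the running counts only after predicting, to avoid leakage.
    s[0] += row.correct; s[1] += 1
    q[0] += row.correct; q[1] += 1

attempts["p_correct"] = preds
print("accuracy:", ((attempts["p_correct"] > 0.5) == (attempts["correct"] == 1)).mean())
```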