Leveraging LLMs for Cognitive Skill Mapping in TIMSS Mathematics Assessment

Ruchi J Sachdeva, Jung Yeon Park


Abstract
This study evaluates ChatGPT-4’s potential to support validation of Q-matrices and analysis of complex skill–item interactions. By comparing its outputs to expert benchmarks, we assess accuracy, consistency, and limitations, offering insights into how large language models can augment expert judgment in diagnostic assessment and cognitive skill mapping.
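The comparison against expert benchmarks described above can be illustrated with a minimal sketch. It assumes a Q-matrix is stored as a binary item-by-skill table (1 means the item requires the skill) and uses cell-wise agreement and Cohen's kappa as example accuracy measures, with agreement between two repeated model runs as an example consistency measure. These metric choices, the function names, and the toy matrices are assumptions made for illustration; the abstract does not state which measures or data the authors used.

```python
from typing import List

# Hypothetical representation: rows are items, columns are skills,
# and 1 means the item is judged to require that skill.
QMatrix = List[List[int]]


def flatten(q: QMatrix) -> List[int]:
    """Return all item-skill cells as one flat list."""
    return [cell for row in q for cell in row]


def cell_agreement(a: QMatrix, b: QMatrix) -> float:
    """Proportion of item-skill cells on which two Q-matrices agree."""
    fa, fb = flatten(a), flatten(b)
    if len(fa) != len(fb):
        raise ValueError("Q-matrices must have the same shape")
    return sum(x == y for x, y in zip(fa, fb)) / len(fa)


def cohen_kappa(a: QMatrix, b: QMatrix) -> float:
    """Chance-corrected agreement between two binary Q-matrices."""
    fa, fb = flatten(a), flatten(b)
    if len(fa) != len(fb):
        raise ValueError("Q-matrices must have the same shape")
    n = len(fa)
    p_observed = sum(x == y for x, y in zip(fa, fb)) / n
    p_a, p_b = sum(fa) / n, sum(fb) / n  # share of cells coded 1 by each rater
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1.0:  # both matrices constant and identical
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data invented for this sketch (4 items x 3 skills); not study results.
expert = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
llm_run1 = [[1, 0, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
llm_run2 = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 1, 1]]

accuracy = cell_agreement(expert, llm_run1)       # model vs. expert benchmark
kappa = cohen_kappa(expert, llm_run1)             # chance-corrected accuracy
consistency = cell_agreement(llm_run1, llm_run2)  # run-to-run stability
```

Cell-wise agreement is easy to read but can be inflated when most cells are 0, which is why the sketch also reports a chance-corrected statistic.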
Anthology ID: 2025.aimecon-wip.28
Volume: Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Works in Progress
Month: October
Year: 2025
Address: Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States
Editors: Joshua Wilson, Christopher Ormerod, Magdalen Beiting Parrish
Venue: AIME-Con
Publisher: National Council on Measurement in Education (NCME)
Pages: 223–228
URL: https://aclanthology.org/2025.aimecon-wip.28/
Cite (ACL): Ruchi J Sachdeva and Jung Yeon Park. 2025. Leveraging LLMs for Cognitive Skill Mapping in TIMSS Mathematics Assessment. In Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Works in Progress, pages 223–228, Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States. National Council on Measurement in Education (NCME).
Cite (Informal): Leveraging LLMs for Cognitive Skill Mapping in TIMSS Mathematics Assessment (Sachdeva & Park, AIME-Con 2025)
PDF: https://aclanthology.org/2025.aimecon-wip.28.pdf