Shayan Zargari
2024
Using Machine Learning to Predict Item Difficulty and Response Time in Medical Tests
Mehrdad Yousefpoori-Naeim
|
Shayan Zargari
|
Zahra Hatami
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
Prior knowledge of item characteristics, such as difficulty and response time, without pretesting items can substantially save time and cost in high-standard test development. Using a variety of machine learning (ML) algorithms, the present study explored several (non-)linguistic features (such as Coh-Metrix indices) along with MPNet word embeddings to predict the difficulty and response time of a sample of medical test items. In both prediction tasks, the contribution of embeddings to models already containing other features was found to be extremely limited. Moreover, a comparison of feature importance scores across the two prediction tasks revealed that cohesion-based features were the strongest predictors of difficulty, while the prediction of response time was primarily dependent on length-related features.