Automated Prediction of Examinee Proficiency from Short-Answer Questions
Le An Ha | Victoria Yaneva | Polina Harik | Ravi Pandian | Amy Morales | Brian Clauser
Proceedings of the 28th International Conference on Computational Linguistics
This paper brings together approaches from the fields of NLP and psychometric measurement to address the problem of predicting examinee proficiency from responses to short-answer questions (SAQs). While previous approaches train on manually labeled data to predict the human ratings assigned to SAQ responses, the approach presented here models examinee proficiency directly and requires no manually labeled training data. We use data from a large medical exam where experimental SAQ items are embedded alongside 106 scored multiple-choice questions (MCQs). First, the latent trait of examinee proficiency is measured using the scored MCQs; then a model is trained with the experimental SAQ responses as input and proficiency as its target variable. The predicted value is then used as a “score” for the SAQ response and evaluated in terms of its contribution to the precision of proficiency estimation.
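The two-stage setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: the smoothed logit of proportion correct is only a crude stand-in for a psychometric proficiency estimate, the bag-of-words linear regressor is an assumed placeholder for whatever model is actually trained, and all function names and the toy examinee data are hypothetical.

```python
import math
from collections import Counter


def proficiency_proxy(mcq_scores):
    """Smoothed logit of proportion correct on the scored MCQs.

    Assumption: a simple stand-in for a latent-trait (e.g. IRT) estimate.
    """
    p = (sum(mcq_scores) + 0.5) / (len(mcq_scores) + 1.0)
    return math.log(p / (1.0 - p))


def bag_of_words(text):
    """Token counts for a short-answer response (whitespace tokenization)."""
    return Counter(text.lower().split())


def train_linear(features, targets, epochs=200, lr=0.05, l2=0.01):
    """Fit a linear regressor with plain SGD on squared error plus L2 penalty."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(features, targets):
            pred = bias + sum(weights.get(t, 0.0) * c for t, c in x.items())
            err = pred - y
            bias -= lr * err
            for t, c in x.items():
                w = weights.get(t, 0.0)
                weights[t] = w - lr * (err * c + l2 * w)
    return weights, bias


def predict(model, text):
    """Predicted proficiency, used as the 'score' for an SAQ response."""
    weights, bias = model
    return bias + sum(weights.get(t, 0.0) * c
                      for t, c in bag_of_words(text).items())


# Hypothetical toy data: (scored MCQ results, SAQ response text) per examinee.
examinees = [
    ([1] * 9 + [0] * 1, "acute appendicitis"),
    ([1] * 8 + [0] * 2, "appendicitis"),
    ([1] * 3 + [0] * 7, "gastroenteritis"),
    ([1] * 2 + [0] * 8, "stomach flu"),
]

# Stage 1: measure proficiency from the MCQs only (no human SAQ ratings).
targets = [proficiency_proxy(mcq) for mcq, _ in examinees]

# Stage 2: train on SAQ text as input with proficiency as the target.
features = [bag_of_words(text) for _, text in examinees]
model = train_linear(features, targets)

high = predict(model, "acute appendicitis")
low = predict(model, "stomach flu")
```

In this toy setting, responses written by examinees with stronger MCQ performance receive higher predicted values, so `high` exceeds `low` even though no response was ever manually rated.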