Rishikesh Fulari
2024
Utilizing Machine Learning to Predict Question Difficulty and Response Time for Enhanced Test Construction
Rishikesh Fulari | Jonathan Rusert
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
In this paper, we present the details of our contribution to the BEA Shared Task on Automated Prediction of Item Difficulty and Response Time. Participants in this collaborative effort are tasked with developing models to predict the difficulty and response time of multiple-choice items within the medical domain. These items are sourced from the United States Medical Licensing Examination® (USMLE®), a significant medical assessment. To achieve this, we experimented with two featurization techniques, one using linguistic features and the other using embeddings generated by BERT fine-tuned on the MS-MARCO dataset. Further, we tried several different machine learning models such as Linear Regression, Decision Trees, and KNN, as well as boosting models such as XGBoost and GBDT. We found that, of all the models we experimented with, a Random Forest Regressor trained on linguistic features gave the lowest root mean squared error.
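To make the evaluation setup concrete, here is a minimal sketch of surface-level linguistic featurization and the root mean squared error metric. The abstract does not list the features the authors used, so the four features below, and the function names `linguistic_features` and `rmse`, are assumptions chosen for illustration only.

```python
import math
import re


def linguistic_features(text):
    """Return a few surface-level features for one item.

    Illustrative assumption: these are NOT the paper's feature set.
    """
    # Tokens are runs of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Sentences are non-empty chunks separated by ., !, or ?.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    return {
        "n_words": n_words,
        "n_sentences": len(sentences),
        "avg_word_len": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "type_token_ratio": (
            len({w.lower() for w in words}) / n_words if n_words else 0.0
        ),
    }


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Feature dictionaries like these could be turned into a numeric matrix and passed to a regressor such as scikit-learn's `RandomForestRegressor`, with `rmse` applied to its predictions on held-out items. That pairing is one plausible reading of the setup described above, not a reproduction of the authors' pipeline.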