DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions

Nigel Fernandez, Alexander Scarlatos, Wanyong Feng, Simon Woodhead, Andrew Lan


Abstract
High-quality distractors are crucial to both the assessment and pedagogical value of multiple-choice questions (MCQs), where manually crafting ones that anticipate knowledge deficiencies or misconceptions among real students is difficult. Meanwhile, automated distractor generation, even with the help of large language models (LLMs), remains challenging for subjects like math. It is crucial to not only identify plausible distractors but also understand the error behind them. In this paper, we introduce DiVERT (Distractor Generation with Variational Errors Represented as Text), a novel variational approach that learns an interpretable representation of errors behind distractors in math MCQs. Through experiments on a real-world math MCQ dataset with 1,434 questions used by hundreds of thousands of students, we show that DiVERT, despite using a base open-source LLM with 7B parameters, outperforms state-of-the-art approaches using GPT-4o on downstream distractor generation. We also conduct a human evaluation with math educators and find that DiVERT leads to error labels that are of comparable quality to human-authored ones.
Anthology ID:
2024.emnlp-main.512
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
9063–9081
Language:
URL:
https://aclanthology.org/2024.emnlp-main.512
DOI:
Bibkey:
Cite (ACL):
Nigel Fernandez, Alexander Scarlatos, Wanyong Feng, Simon Woodhead, and Andrew Lan. 2024. DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 9063–9081, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions (Fernandez et al., EMNLP 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.emnlp-main.512.pdf