Assessing Distractors in Multiple-Choice Tests

Vatsal Raina, Adian Liusie, Mark Gales


Abstract
Multiple-choice tests are a common approach for assessing candidates’ comprehension skills. Standard multiple-choice reading comprehension exams require candidates to select the correct answer option from a discrete set based on a question in relation to a contextual passage. For appropriate assessment, the distractor answer options must by definition be incorrect but plausible and diverse. However, generating good quality distractors satisfying these criteria is a challenging task for content creators. We propose automated assessment metrics for the quality of distractors in multiple-choice reading comprehension tests. Specifically, we define quality in terms of the incorrectness, plausibility and diversity of the distractor options. We assess incorrectness using the classification ability of a binary multiple-choice reading comprehension system. Plausibility is assessed by considering the distractor confidence, the probability mass that a standard multi-class multiple-choice reading comprehension system assigns to the distractor options. Diversity is assessed by pairwise comparison of an embedding-based equivalence metric between the distractors of a question. To further validate the plausibility metric, we compare it against candidate distributions over multiple-choice questions and measure agreement with a ChatGPT model’s interpretation of distractor plausibility and diversity.
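As a rough illustration of how the three abstract-level metrics could be computed for a single question, the sketch below uses plain NumPy. The function names, the use of cosine similarity as the embedding-based equivalence measure, and the example probabilities are illustrative assumptions, not the paper's implementation; the binary and multi-class reading comprehension systems are stood in for by their output probabilities.

```python
import numpy as np

def incorrectness(p_correct_given_distractor: np.ndarray) -> np.ndarray:
    """Incorrectness per distractor. `p_correct_given_distractor[i]` is assumed
    to be a binary MCRC system's probability that distractor i answers the
    question correctly; low values mean the distractor is safely incorrect."""
    return 1.0 - p_correct_given_distractor

def plausibility(option_probs: np.ndarray, distractor_idx: list[int]) -> float:
    """Plausibility as the probability mass a multi-class MCRC system places
    on the distractor options (`option_probs` is a softmax over all options)."""
    return float(option_probs[distractor_idx].sum())

def diversity(distractor_embeddings: np.ndarray) -> float:
    """Diversity via pairwise embedding comparison: here, 1 minus the mean
    pairwise cosine similarity between distractor embeddings (an assumed
    stand-in for the paper's equivalence metric)."""
    e = distractor_embeddings / np.linalg.norm(
        distractor_embeddings, axis=1, keepdims=True)
    sims = e @ e.T
    upper = np.triu_indices(len(e), k=1)   # each unordered pair once
    return float(1.0 - sims[upper].mean())

# Example: a 4-option question where option 0 is correct and 1-3 are distractors.
option_probs = np.array([0.55, 0.25, 0.15, 0.05])
print(plausibility(option_probs, [1, 2, 3]))   # ~0.45 of the mass on distractors
```

Higher plausibility indicates distractors that genuinely compete with the correct answer, while higher diversity indicates they are not simply paraphrases of one another.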
Anthology ID:
2023.eval4nlp-1.2
Volume:
Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems
Month:
November
Year:
2023
Address:
Bali, Indonesia
Editors:
Daniel Deutsch, Rotem Dror, Steffen Eger, Yang Gao, Christoph Leiter, Juri Opitz, Andreas Rücklé
Venues:
Eval4NLP | WS
Publisher:
Association for Computational Linguistics
Pages:
12–22
URL:
https://aclanthology.org/2023.eval4nlp-1.2
DOI:
10.18653/v1/2023.eval4nlp-1.2
Cite (ACL):
Vatsal Raina, Adian Liusie, and Mark Gales. 2023. Assessing Distractors in Multiple-Choice Tests. In Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems, pages 12–22, Bali, Indonesia. Association for Computational Linguistics.
Cite (Informal):
Assessing Distractors in Multiple-Choice Tests (Raina et al., Eval4NLP-WS 2023)
PDF:
https://aclanthology.org/2023.eval4nlp-1.2.pdf