Evaluation Paradigms in Question Answering

Pedro Rodriguez, Jordan Boyd-Graber


Abstract
Question answering (QA) primarily descends from two branches of research: (1) Alan Turing’s investigation of machine intelligence at Manchester University and (2) Cyril Cleverdon’s comparison of library card catalog indices at Cranfield University. This position paper names and distinguishes these paradigms. Despite substantial overlap, subtle but significant distinctions exert an outsize influence on research. While one evaluation paradigm values creating more intelligent QA systems, the other paradigm values building QA systems that appeal to users. By better understanding the epistemic heritage of QA, researchers, academia, and industry can more effectively accelerate QA research.
Anthology ID:
2021.emnlp-main.758
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
9630–9642
URL:
https://aclanthology.org/2021.emnlp-main.758
DOI:
10.18653/v1/2021.emnlp-main.758
Cite (ACL):
Pedro Rodriguez and Jordan Boyd-Graber. 2021. Evaluation Paradigms in Question Answering. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 9630–9642, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Evaluation Paradigms in Question Answering (Rodriguez & Boyd-Graber, EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.758.pdf
Video:
https://aclanthology.org/2021.emnlp-main.758.mp4
Data
Natural Questions