MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models

Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, Noah Constant


Abstract
Retrieval question answering (ReQA) is the task of retrieving a sentence-level answer to a question from an open corpus (Ahmad et al., 2019). This dataset paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets. We explore systematic retrieval-based evaluation and transfer learning across domains over these datasets using a number of strong baselines, including two supervised neural models, based on fine-tuning BERT and USE-QA models respectively, as well as a surprisingly effective information retrieval baseline, BM25. Five of these tasks contain both training and test data, while three contain test data only. Performing cross training on the five tasks with training data shows that while a general model covering all domains is achievable, the best performance is often obtained by training exclusively on in-domain data.
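For context on the BM25 baseline the abstract calls "surprisingly effective": BM25 ranks candidate answer sentences purely by weighted lexical overlap with the question. Below is a minimal self-contained sketch of Okapi BM25 scoring; the toy corpus, whitespace tokenization, and parameter values (k1, b) are illustrative assumptions, not the paper's actual experimental setup.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of the query against each candidate sentence."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    df = Counter()                      # document frequency per term
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)                 # term frequency within this sentence
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores

# Toy candidate-answer corpus, whitespace-tokenized (illustrative only).
corpus = [
    "the mitochondria is the powerhouse of the cell".split(),
    "paris is the capital of france".split(),
    "bm25 is a ranking function used in information retrieval".split(),
]
query = "what is the capital of france".split()
scores = bm25_scores(query, corpus)
best = max(range(len(corpus)), key=scores.__getitem__)  # index of top-ranked sentence
```

In the ReQA setting, each sentence in the open corpus is a retrieval candidate, so the same scoring would be applied over the full candidate pool rather than this three-sentence toy example.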
Anthology ID:
2021.adaptnlp-1.10
Volume:
Proceedings of the Second Workshop on Domain Adaptation for NLP
Month:
April
Year:
2021
Address:
Kyiv, Ukraine
Venues:
AdaptNLP | EACL
Publisher:
Association for Computational Linguistics
Pages:
94–104
URL:
https://aclanthology.org/2021.adaptnlp-1.10
PDF:
https://aclanthology.org/2021.adaptnlp-1.10.pdf
Code
 google-research-datasets/MultiReQA
Data
BioASQ | HotpotQA | Natural Questions | ReQA | SQuAD | SearchQA | TriviaQA