RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering

Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih


Abstract
State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, motivating the need for answer re-ranking. We develop a successful re-ranking approach (RECONSIDER) for span-extraction tasks that improves upon the performance of MRC models, even beyond large-scale pre-training. RECONSIDER is trained on positive and negative examples extracted from high confidence MRC model predictions, and uses in-passage span annotations to perform span-focused re-ranking over a smaller candidate set. As a result, RECONSIDER learns to eliminate close false positives, achieving a new extractive state of the art on four QA tasks, with 45.5% Exact Match accuracy on Natural Questions with real user questions, and 61.7% on TriviaQA. We will release all related data, models, and code.
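The span-focused re-ranking the abstract describes can be sketched as a cross-encoder that re-scores each of the base MRC model's top-k candidate spans, with the candidate explicitly marked inside its passage so cross-attention can focus on it. The sketch below is a minimal illustration under stated assumptions, not the authors' released code: the marker tokens [A]/[/A], the bert-base-uncased checkpoint, and the rerank helper are all hypothetical choices for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical base checkpoint; the paper's released models may differ.
MODEL_NAME = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Marker tokens (assumed names) that delimit the candidate answer span
# inside the passage, so the re-ranker's attention can focus on it.
tokenizer.add_special_tokens({"additional_special_tokens": ["[A]", "[/A]"]})

# A single-logit classification head serves as the re-ranking score.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=1)
model.resize_token_embeddings(len(tokenizer))
model.eval()

def rerank(question, candidates):
    """Re-score top-k MRC candidates and return the best span.

    candidates: list of (span_text, passage, start_char, end_char)
    taken from the base MRC model's high-confidence predictions.
    """
    scored = []
    for span, passage, start, end in candidates:
        # Mark the candidate span in the passage with the special tokens.
        marked = passage[:start] + "[A] " + passage[start:end] + " [/A]" + passage[end:]
        inputs = tokenizer(question, marked, truncation=True,
                           max_length=384, return_tensors="pt")
        with torch.no_grad():
            score = model(**inputs).logits.squeeze().item()
        scored.append((score, span))
    scored.sort(reverse=True)
    return scored[0][1]
```

In practice such a re-ranker would be fine-tuned on positive and negative examples drawn from the base model's high-confidence predictions, as the abstract describes; the untrained head above only illustrates the input construction and scoring loop.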
Anthology ID:
2021.naacl-main.100
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1280–1287
URL:
https://aclanthology.org/2021.naacl-main.100
DOI:
10.18653/v1/2021.naacl-main.100
Bibkey:
Cite (ACL):
Srinivasan Iyer, Sewon Min, Yashar Mehdad, and Wen-tau Yih. 2021. RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1280–1287, Online. Association for Computational Linguistics.
Cite (Informal):
RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering (Iyer et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.100.pdf
Video:
https://aclanthology.org/2021.naacl-main.100.mp4
Data:
Natural Questions, TriviaQA