A Simple and Effective Model for Answering Multi-span Questions

Elad Segal, Avia Efrat, Mor Shoham, Amir Globerson, Jonathan Berant


Abstract
Models for reading comprehension (RC) commonly restrict their output space to the set of all single contiguous spans from the input, in order to alleviate the learning problem and avoid the need for a model that generates text explicitly. However, forcing an answer to be a single span can be restrictive, and some recent datasets also include multi-span questions, i.e., questions whose answer is a set of non-contiguous spans in the text. Naturally, models that return single spans cannot answer these questions. In this work, we propose a simple architecture for answering multi-span questions by casting the task as a sequence tagging problem, namely, predicting for each input token whether it should be part of the output or not. Our model substantially improves performance on span extraction questions from DROP and Quoref by 9.9 and 5.5 EM points respectively.
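As a rough illustration of the tagging formulation described in the abstract, the sketch below places a token-level BIO classification head on top of a pretrained encoder and decodes contiguous tagged runs into answer spans. The model name, class names, and decoding helper are illustrative assumptions for exposition, not the authors' released implementation (see the linked repository for that).

```python
# Minimal sketch of the sequence-tagging formulation: a per-token
# classifier over {O, B, I} tags on top of a pretrained encoder, so that
# contiguous runs of tagged tokens become (possibly multiple) answer spans.
# Names here are illustrative, not the authors' code.
import torch.nn as nn
from transformers import AutoModel

class MultiSpanTagger(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_tags=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.tag_head = nn.Linear(self.encoder.config.hidden_size, num_tags)

    def forward(self, input_ids, attention_mask):
        # Contextual token representations from the encoder.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # One O/B/I logit vector per input token.
        return self.tag_head(hidden)

def decode_spans(tags, tokens):
    """Turn per-token tags (0=O, 1=B, 2=I) into a list of text spans."""
    spans, current = [], []
    for tag, tok in zip(tags, tokens):
        if tag == 1:                # B: start a new span
            if current:
                spans.append(current)
            current = [tok]
        elif tag == 2 and current:  # I: continue the current span
            current.append(tok)
        else:                       # O: close any open span
            if current:
                spans.append(current)
            current = []
    if current:
        spans.append(current)
    return [" ".join(s) for s in spans]
```

At inference time, per-token tag predictions (for instance, an argmax over the logits) are decoded into zero, one, or several spans, which is what lets a single tagging head answer both single-span and multi-span questions.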
Anthology ID:
2020.emnlp-main.248
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Editors:
Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3074–3080
URL:
https://aclanthology.org/2020.emnlp-main.248
DOI:
10.18653/v1/2020.emnlp-main.248
Cite (ACL):
Elad Segal, Avia Efrat, Mor Shoham, Amir Globerson, and Jonathan Berant. 2020. A Simple and Effective Model for Answering Multi-span Questions. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3074–3080, Online. Association for Computational Linguistics.
Cite (Informal):
A Simple and Effective Model for Answering Multi-span Questions (Segal et al., EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.248.pdf
Code
eladsegal/tag-based-multi-span-extraction (+ additional community code)
Data
DROP, Quoref