FocusQA: Open-Domain Question Answering with a Context in Focus

Gianni Barlacchi, Ivano Lauriola, Alessandro Moschitti, Marco Del Tredici, Xiaoyu Shen, Thuy Vu, Bill Byrne, Adrià de Gispert


Abstract
We introduce question answering with a context in focus, a task that simulates a free interaction with a QA system. The user reads some information about a topic on a screen and can follow up with questions that may or may not be related to the topic; the answer can be found in the document containing the screen content or in other pages. We call the information shown on the screen the context. To study the task, we construct FocusQA, a dataset for answer sentence selection (AS2) with 12,165 unique question/context pairs and a total of 109,940 answers. To build the dataset, we developed a novel methodology that takes existing questions and pairs them with relevant contexts. To show the benefits of this approach, we present a comparative analysis with a set of questions written by humans after reading the context, showing that our approach greatly helps in eliciting more realistic question/context pairs. Finally, we show that the task poses several challenges for incorporating contextual information. In this respect, we introduce strong baselines for answer sentence selection that outperform the precision of state-of-the-art models for AS2 by up to 21.3 absolute percentage points.
Anthology ID:
2022.findings-emnlp.381
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
5195–5208
URL:
https://aclanthology.org/2022.findings-emnlp.381
DOI:
10.18653/v1/2022.findings-emnlp.381
Cite (ACL):
Gianni Barlacchi, Ivano Lauriola, Alessandro Moschitti, Marco Del Tredici, Xiaoyu Shen, Thuy Vu, Bill Byrne, and Adrià de Gispert. 2022. FocusQA: Open-Domain Question Answering with a Context in Focus. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5195–5208, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
FocusQA: Open-Domain Question Answering with a Context in Focus (Barlacchi et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.381.pdf
Video:
https://aclanthology.org/2022.findings-emnlp.381.mp4