Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering

Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova


Abstract
We address the problem of extractive question answering using document-level distant supervision, pairing questions and relevant documents with answer strings. We compare previously used probability space and distant supervision assumptions (assumptions on the correspondence between the weak answer string labels and possible answer mention spans). We show that these assumptions interact, and that different configurations provide complementary benefits. We demonstrate that a multi-objective model can efficiently combine the advantages of multiple assumptions and outperform the best individual formulation. Our approach outperforms previous state-of-the-art models by 4.3 points in F1 on TriviaQA-Wiki and 1.7 points in Rouge-L on NarrativeQA summaries.
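To make the idea of a distant supervision assumption concrete, here is a minimal sketch contrasting two common ways of scoring a document whose only label is an answer string. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy scores, and the choice to normalize over all candidate spans in one document are hypothetical. The first loss treats every span matching the answer string as possibly correct and sums their probabilities; the second credits only the single most likely matching span.

```python
import math


def log_softmax(scores):
    """Turn raw span scores into log-probabilities over all candidate spans."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def logsumexp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_loss(scores, matches):
    """Negative log of the total probability on spans matching the answer string."""
    log_probs = log_softmax(scores)
    return -logsumexp([lp for lp, ok in zip(log_probs, matches) if ok])


def max_loss(scores, matches):
    """Negative log-probability of the single most likely matching span."""
    log_probs = log_softmax(scores)
    return -max(lp for lp, ok in zip(log_probs, matches) if ok)


# Hypothetical scores for four candidate spans; spans 0 and 2 match the answer string.
scores = [2.0, 0.5, 1.0, -1.0]
matches = [True, False, True, False]
print(marginal_loss(scores, matches), max_loss(scores, matches))
```

The marginal loss is never larger than the max loss, since the summed probability of all matching spans is at least the probability of the best one. The two coincide when exactly one span matches the answer string.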
Anthology ID:
2020.acl-main.501
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5657–5667
URL:
https://aclanthology.org/2020.acl-main.501
DOI:
10.18653/v1/2020.acl-main.501
Cite (ACL):
Hao Cheng, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2020. Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5657–5667, Online. Association for Computational Linguistics.
Cite (Informal):
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering (Cheng et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.501.pdf
Video:
http://slideslive.com/38929404
Code:
hao-cheng/ds_doc_qa
Data:
NarrativeQA, TriviaQA