Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies

Mor Geva; Daniel Khashabi; Elad Segal; Tushar Khot; Dan Roth; Jonathan Berant

doi:10.1162/tacl_a_00370

Abstract

A key limitation in current datasets for multi-hop reasoning is that the required steps for answering the question are mentioned in it explicitly. In this work, we introduce StrategyQA, a question answering (QA) benchmark where the required reasoning steps are implicit in the question, and should be inferred using a strategy. A fundamental challenge in this setup is how to elicit such creative questions from crowdsourcing workers, while covering a broad range of potential strategies. We propose a data collection procedure that combines term-based priming to inspire annotators, careful control over the annotator population, and adversarial filtering for eliminating reasoning shortcuts. Moreover, we annotate each question with (1) a decomposition into reasoning steps for answering it, and (2) Wikipedia paragraphs that contain the answers to each step. Overall, StrategyQA includes 2,780 examples, each consisting of a strategy question, its decomposition, and evidence paragraphs. Analysis shows that questions in StrategyQA are short, topic-diverse, and cover a wide range of strategies. Empirically, we show that humans perform well (87%) on this task, while our best baseline reaches an accuracy of approximately 66%.
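The abstract describes each example as a strategy question paired with a decomposition and evidence paragraphs. A minimal sketch of one way to hold such a record and score predictions follows. The class name, field names, the boolean answer type, the decomposition wording, and the evidence titles are all illustrative assumptions, not the released data format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StrategyExample:
    """One benchmark record. All field names here are hypothetical."""
    question: str                      # the strategy question
    answer: bool                       # assumed yes/no label
    decomposition: List[str]           # ordered reasoning steps
    # One list of paragraph titles per step; empty when a step needs no lookup.
    evidence: List[List[str]] = field(default_factory=list)


def accuracy(predictions: List[bool], examples: List[StrategyExample]) -> float:
    """Fraction of predictions that match the gold answers."""
    if len(predictions) != len(examples):
        raise ValueError("predictions and examples must have the same length")
    if not examples:
        return 0.0
    correct = sum(p == ex.answer for p, ex in zip(predictions, examples))
    return correct / len(examples)


# Illustrative record built from the title question; the steps and
# evidence titles are placeholders, not copied from the dataset.
example = StrategyExample(
    question="Did Aristotle use a laptop?",
    answer=False,
    decomposition=[
        "When did Aristotle live?",
        "When was the laptop invented?",
        "Is #2 before #1?",
    ],
    evidence=[["Aristotle"], ["Laptop"], []],
)
```

With this layout, `accuracy([False], [example])` scores a single correct prediction, and the reported human and baseline figures correspond to the same ratio computed over a full evaluation set.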

Anthology ID: 2021.tacl-1.21
Volume: Transactions of the Association for Computational Linguistics, Volume 9
Year: 2021
Address: Cambridge, MA
Editors: Brian Roark, Ani Nenkova
Venue: TACL
Publisher: MIT Press
Pages: 346–361
URL: https://aclanthology.org/2021.tacl-1.21/
DOI: 10.1162/tacl_a_00370
Cite (ACL): Mor Geva, Daniel Khashabi, Elad Segal, Tushar Khot, Dan Roth, and Jonathan Berant. 2021. Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies. Transactions of the Association for Computational Linguistics, 9:346–361.
Cite (Informal): Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies (Geva et al., TACL 2021)
PDF: https://aclanthology.org/2021.tacl-1.21.pdf
Video: https://aclanthology.org/2021.tacl-1.21.mp4
