Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multihop Question-Answering

Harsh Jhamtani, Peter Clark


Abstract
Despite the rapid progress in multihop question-answering (QA), models still have trouble explaining why an answer is correct, with limited explanation training data available to learn from. To address this, we introduce three explanation datasets in which explanations formed from corpus facts are annotated. Our first dataset, eQASC, contains over 98K explanation annotations for the multihop question-answering dataset QASC, and is the first that annotates multiple candidate explanations for each answer. The second dataset, eQASC-perturbed, is constructed by crowd-sourcing perturbations (while preserving their validity) of a subset of explanations in QASC, to test consistency and generalization of explanation prediction models. The third dataset, eOBQA, is constructed by adding explanation annotations to the OBQA dataset to test generalization of models trained on eQASC. We show that this data can be used to significantly improve explanation quality (+14% absolute F1 over a strong retrieval baseline) using a BERT-based classifier, but results still fall below the upper bound, offering a new challenge for future research. We also explore a delexicalized chain representation in which repeated noun phrases are replaced by variables, thus turning them into generalized reasoning chains (for example: “X is a Y” AND “Y has Z” IMPLIES “X has Z”). We find that generalized chains maintain performance while also being more robust to certain perturbations.
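The delexicalized-chain idea described in the abstract can be illustrated with a small sketch: a noun phrase shared between consecutive facts in a reasoning chain is replaced by a variable, yielding a generalized chain such as “X is a Y” AND “Y has Z” IMPLIES “X has Z”. The phrase-matching heuristic, function names, and toy chain below are illustrative assumptions, not the authors' implementation; the paper's own procedure for identifying repeated noun phrases may differ.

```python
# Illustrative sketch (not the authors' code): turn a lexical reasoning chain
# into a generalized chain by replacing a noun phrase shared between
# consecutive facts with a variable, e.g.
#   "a poodle is a dog" AND "a dog has fur"  ->  "a poodle is X" AND "X has fur"
# The shared-phrase heuristic (longest common word n-gram) is a stand-in for
# whatever phrase matching the paper actually uses.

def shared_phrases(a: str, b: str):
    """Word n-grams of `a` that also occur in `b`, longest first."""
    words_a, text_b = a.lower().split(), b.lower()
    grams = []
    for n in range(len(words_a), 0, -1):
        for i in range(len(words_a) - n + 1):
            gram = " ".join(words_a[i:i + n])
            if gram in text_b and gram not in grams:
                grams.append(gram)
    return grams


def delexicalize(facts):
    """Replace the longest phrase shared by each consecutive fact pair with a variable."""
    variables = iter(["X", "Y", "Z", "W", "V"])  # enough for short chains
    out = [f.lower() for f in facts]
    for i in range(len(out) - 1):
        matches = shared_phrases(out[i], out[i + 1])
        if matches:
            var = next(variables)
            out = [s.replace(matches[0], var) for s in out]
    return out


if __name__ == "__main__":
    chain = ["a poodle is a dog", "a dog has fur"]
    print(delexicalize(chain))  # -> ['a poodle is X', 'X has fur']
```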
Anthology ID:
2020.emnlp-main.10
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Editors:
Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
137–150
URL:
https://aclanthology.org/2020.emnlp-main.10
DOI:
10.18653/v1/2020.emnlp-main.10
Cite (ACL):
Harsh Jhamtani and Peter Clark. 2020. Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multihop Question-Answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 137–150, Online. Association for Computational Linguistics.
Cite (Informal):
Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multihop Question-Answering (Jhamtani & Clark, EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.10.pdf
Code:
harsh19/Reasoning-Chains-MultihopQA
Data:
eQASC, HotpotQA, OpenBookQA, QA2D, QASC