%0 Conference Proceedings
%T multiPRover: Generating Multiple Proofs for Improved Interpretability in Rule Reasoning
%A Saha, Swarnadeep
%A Yadav, Prateek
%A Bansal, Mohit
%S Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
%D 2021
%8 June
%I Association for Computational Linguistics
%C Online
%F saha-etal-2021-multiprover
%X We focus on a type of linguistic formal reasoning where the goal is to reason over explicit knowledge in the form of natural language facts and rules (Clark et al., 2020). A recent work, named PRover (Saha et al., 2020), performs such reasoning by answering a question and also generating a proof graph that explains the answer. However, compositional reasoning is not always unique and there may be multiple ways of reaching the correct answer. Thus, in our work, we address a new and challenging problem of generating multiple proof graphs for reasoning over natural language rule-bases. Each proof provides a different rationale for the answer, thereby improving the interpretability of such reasoning systems. In order to jointly learn from all proof graphs and exploit the correlations between multiple proofs for a question, we pose this task as a set generation problem over structured output spaces where each proof is represented as a directed graph. We propose two variants of a proof-set generation model, multiPRover. Our first model, Multilabel-multiPRover, generates a set of proofs via multi-label classification and implicit conditioning between the proofs; while the second model, Iterative-multiPRover, generates proofs iteratively by explicitly conditioning on the previously generated proofs. Experiments on multiple synthetic, zero-shot, and human-paraphrased datasets reveal that both multiPRover models significantly outperform PRover on datasets containing multiple gold proofs. Iterative-multiPRover obtains state-of-the-art proof F1 in zero-shot scenarios where all examples have single correct proofs. It also generalizes better to questions requiring higher depths of reasoning where multiple proofs are more frequent.
%R 10.18653/v1/2021.naacl-main.287
%U https://aclanthology.org/2021.naacl-main.287
%U https://doi.org/10.18653/v1/2021.naacl-main.287
%P 3662-3677