When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data

Peter Hase, Mohit Bansal


Abstract
Many methods now exist for conditioning models on task instructions and user-provided explanations for individual data points. These methods show great promise for improving task performance of language models beyond what can be achieved by learning from individual (x,y) pairs. In this paper, we (1) provide a formal framework for characterizing approaches to learning from explanation data, and (2) propose a synthetic task for studying how models learn from explanation data. In the first direction, we give graphical models for the available modeling approaches, in which explanation data can be used as model inputs, as targets, or as a prior. In the second direction, we introduce a carefully designed synthetic task with several properties making it useful for studying a model’s ability to learn from explanation data. Each data point in this binary classification task is accompanied by a string that is essentially an answer to the why question: “why does data point x have label y?” We aim to encourage research into this area by identifying key considerations for the modeling problem and providing an empirical testbed for theories of how models can best learn from explanation data.
Anthology ID:
2022.lnls-1.4
Volume:
Proceedings of the First Workshop on Learning with Natural Language Supervision
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Jacob Andreas, Karthik Narasimhan, Aida Nematzadeh
Venue:
LNLS
Publisher:
Association for Computational Linguistics
Pages:
29–39
URL:
https://aclanthology.org/2022.lnls-1.4
DOI:
10.18653/v1/2022.lnls-1.4
Cite (ACL):
Peter Hase and Mohit Bansal. 2022. When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data. In Proceedings of the First Workshop on Learning with Natural Language Supervision, pages 29–39, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data (Hase & Bansal, LNLS 2022)
PDF:
https://aclanthology.org/2022.lnls-1.4.pdf
Video:
 https://aclanthology.org/2022.lnls-1.4.mp4
Code
 peterbhase/ExplanationRoles
Data
SNLI, TACRED, e-SNLI