Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations

Swarnadeep Saha, Peter Hase, Nazneen Rajani, Mohit Bansal


Abstract
Recent work on explainable NLP has shown that few-shot prompting can enable large pre-trained language models (LLMs) to generate grammatical and factual natural language explanations for data labels. In this work, we study the connection between explainability and sample hardness by investigating the following research question: "Are LLMs and humans equally good at explaining data labels for both easy and hard samples?" We answer this question by first collecting human-written explanations in the form of generalizable commonsense rules on the Winograd Schema Challenge task (the Winogrande dataset). We compare these explanations with those generated by GPT-3 while varying the hardness of the test samples as well as the in-context samples. We observe that (1) GPT-3 explanations are as grammatical as human explanations regardless of the hardness of the test samples, (2) for easy examples, GPT-3 generates highly supportive explanations but human explanations are more generalizable, and (3) for hard examples, human explanations are significantly better than GPT-3 explanations in terms of both label-supportiveness and generalizability judgements. We also find that the hardness of the in-context examples impacts the quality of GPT-3 explanations. Finally, we show that the supportiveness and generalizability aspects of human explanations are also impacted by sample hardness, although by a much smaller margin than for model-generated explanations.
Anthology ID: 2022.emnlp-main.137
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 2121–2131
URL: https://aclanthology.org/2022.emnlp-main.137
DOI: 10.18653/v1/2022.emnlp-main.137
Cite (ACL): Swarnadeep Saha, Peter Hase, Nazneen Rajani, and Mohit Bansal. 2022. Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2121–2131, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations (Saha et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.137.pdf