Optimal and efficient text counterfactuals using Graph Neural Networks

Dimitris Lymperopoulos, Maria Lymperaiou, Giorgos Filandrianos, Giorgos Stamou


Abstract
As NLP models become increasingly integral to decision-making processes, the need for explainability and interpretability has become paramount. In this work, we propose a framework that achieves the aforementioned by generating semantically edited inputs, known as counterfactual interventions, which change the model prediction, thus providing a form of counterfactual explanations for the model. We frame the search for optimal counterfactual interventions as a graph assignment problem and employ a GNN to solve it, thus achieving high efficiency. We test our framework on two NLP tasks - binary sentiment classification and topic classification - and show that the generated edits are contrastive, fluent and minimal, while the whole process remains significantly faster than other state-of-the-art counterfactual editors.
Anthology ID:
2024.blackboxnlp-1.1
Volume:
Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
Month:
November
Year:
2024
Address:
Miami, Florida, US
Editors:
Yonatan Belinkov, Najoung Kim, Jaap Jumelet, Hosein Mohebbi, Aaron Mueller, Hanjie Chen
Venue:
BlackboxNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1–14
Language:
URL:
https://aclanthology.org/2024.blackboxnlp-1.1
DOI:
Bibkey:
Cite (ACL):
Dimitris Lymperopoulos, Maria Lymperaiou, Giorgos Filandrianos, and Giorgos Stamou. 2024. Optimal and efficient text counterfactuals using Graph Neural Networks. In Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, pages 1–14, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal):
Optimal and efficient text counterfactuals using Graph Neural Networks (Lymperopoulos et al., BlackboxNLP 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.blackboxnlp-1.1.pdf