Better Explain Transformers by Illuminating Important Information

Linxin Song; Yan Cui; Ao Luo; Freddy Lecue; Irene Li

doi:10.18653/v1/2024.findings-eacl.138

Abstract

Transformer-based models excel in various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings. Prior methods explain Transformers by using raw gradients and attention as token attribution scores; because irrelevant information is often included when the explanation is computed, the results can be confusing. In this work, we propose highlighting the important information and eliminating irrelevant information through a refined information flow built on top of the layer-wise relevance propagation (LRP) method. Specifically, we identify syntactic and positional heads as the important attention heads and focus on the relevance obtained from them. Experimental results demonstrate that irrelevant information does distort output attribution scores and should therefore be masked during explanation computation. Compared to eight baselines on both classification and question-answering datasets, our method consistently outperforms them, with improvements of over 3% and up to 33% on explanation metrics. Our anonymous code repository is available at: https://anonymous.4open.science/r/MLRP-E676/
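To make the masking idea concrete, the following is a minimal sketch of aggregating per-head token relevance while discarding heads judged unimportant. It is not the authors' implementation: the function name `masked_token_attribution`, the list-of-lists layout (one row of token scores per head), the boolean head mask, and the toy numbers are all assumptions made for illustration. How relevance is actually propagated with LRP, and how syntactic and positional heads are identified, is not specified in the abstract and is not modeled here.

```python
from typing import List, Sequence


def masked_token_attribution(
    head_relevance: Sequence[Sequence[float]],
    important_heads: Sequence[bool],
) -> List[float]:
    """Sum per-head token relevance over important heads only, then normalize.

    Hypothetical helper: `head_relevance[h][t]` is the relevance that head `h`
    assigns to token `t`; `important_heads[h]` says whether head `h` is kept.
    """
    if len(head_relevance) != len(important_heads):
        raise ValueError("one mask entry is required per head")

    num_tokens = len(head_relevance[0]) if head_relevance else 0
    totals = [0.0] * num_tokens
    for scores, keep in zip(head_relevance, important_heads):
        if not keep:
            continue  # relevance from unimportant heads is masked out
        for i, score in enumerate(scores):
            totals[i] += score

    # Normalize by the L1 norm so scores are comparable across inputs.
    norm = sum(abs(t) for t in totals)
    return [t / norm for t in totals] if norm > 0 else totals


# Toy input: 3 heads, 4 tokens (values made up for illustration).
relevance = [
    [0.5, 0.25, 0.25, 0.0],  # head 0, assumed to be a syntactic head
    [0.0, 0.0, 0.0, 1.0],    # head 1, assumed to be unimportant
    [0.25, 0.25, 0.5, 0.0],  # head 2, assumed to be a positional head
]
mask = [True, False, True]
scores = masked_token_attribution(relevance, mask)
print(scores)
```

In this toy case, head 1 places all of its relevance on the last token; once that head is masked, the last token receives no attribution, which illustrates how relevance from an unimportant head could otherwise distort the final scores.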

Anthology ID: 2024.findings-eacl.138
Volume: Findings of the Association for Computational Linguistics: EACL 2024
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2048–2062
URL: https://aclanthology.org/2024.findings-eacl.138/
DOI: 10.18653/v1/2024.findings-eacl.138
Cite (ACL): Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, and Irene Li. 2024. Better Explain Transformers by Illuminating Important Information. In Findings of the Association for Computational Linguistics: EACL 2024, pages 2048–2062, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Better Explain Transformers by Illuminating Important Information (Song et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-eacl.138.pdf
