LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models

Igor Tufanov; Karen Hambardzumyan; Javier Ferrando; Elena Voita

doi:10.18653/v1/2024.acl-demos.6

Abstract

We present the LM Transparency Tool (LM-TT), an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models. Unlike existing tools, which focus on isolated parts of the decision-making process, our framework is designed to make the entire prediction process transparent and allows tracing model behavior back from the top-layer representation to very fine-grained parts of the model. Specifically, it (i) shows the important part of the whole input-to-output information flow, (ii) allows attributing any changes made by a model block to individual attention heads and feed-forward neurons, and (iii) allows interpreting the functions of those heads or neurons. A crucial part of this pipeline is showing the importance of specific model components at each step. As a result, we are able to look at the roles of model components only in cases where they are important for a prediction. Since knowing which components should be inspected is key for analyzing large models, where the number of these components is extremely high, we believe our tool will greatly support the interpretability community both in research settings and in practical applications.

Anthology ID: 2024.acl-demos.6
Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Yixin Cao, Yang Feng, Deyi Xiong
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 51–60
URL: https://aclanthology.org/2024.acl-demos.6
DOI: 10.18653/v1/2024.acl-demos.6
Cite (ACL): Igor Tufanov, Karen Hambardzumyan, Javier Ferrando, and Elena Voita. 2024. LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 51–60, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models (Tufanov et al., ACL 2024)
PDF: https://aclanthology.org/2024.acl-demos.6.pdf
