Inseq: An Interpretability Toolkit for Sequence Generation Models

Gabriele Sarti; Nils Feldhus; Ludwig Sickert; Oskar van der Wal

doi:10.18653/v1/2023.acl-demo.40

Abstract

Past work in natural language processing interpretability focused mainly on popular classification tasks while largely overlooking generation settings, partly due to a lack of dedicated tools. In this work, we introduce Inseq, a Python library to democratize access to interpretability analyses of sequence generation models. Inseq enables intuitive and optimized extraction of models’ internal information and feature importance scores for popular decoder-only and encoder-decoder Transformers architectures. We showcase its potential by adopting it to highlight gender biases in machine translation models and locate factual knowledge inside GPT-2. Thanks to its extensible interface supporting cutting-edge techniques such as contrastive feature attribution, Inseq can drive future advances in explainable natural language generation, centralizing good practices and enabling fair and reproducible model evaluations.

Anthology ID: 2023.acl-demo.40
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Danushka Bollegala, Ruihong Huang, Alan Ritter
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 421–435
URL: https://aclanthology.org/2023.acl-demo.40
DOI: 10.18653/v1/2023.acl-demo.40
Cite (ACL): Gabriele Sarti, Nils Feldhus, Ludwig Sickert, and Oskar van der Wal. 2023. Inseq: An Interpretability Toolkit for Sequence Generation Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 421–435, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Inseq: An Interpretability Toolkit for Sequence Generation Models (Sarti et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-demo.40.pdf
Video: https://aclanthology.org/2023.acl-demo.40.mp4
