Phillip Howard


InterpreT: An Interactive Visualization Tool for Interpreting Transformers
Vasudev Lal | Arden Ma | Estelle Aflalo | Phillip Howard | Ana Simoes | Daniel Korat | Oren Pereg | Gadi Singer | Moshe Wasserblat
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

With the increasingly widespread use of Transformer-based models for NLU/NLP tasks, there is growing interest in understanding the inner workings of these models, why they are so effective at a wide range of tasks, and how they can be further tuned and improved. To contribute towards this goal of enhanced explainability and comprehension, we present InterpreT, an interactive visualization tool for interpreting Transformer-based models. In addition to providing various mechanisms for investigating general model behaviours, novel contributions made in InterpreT include the ability to track and visualize token embeddings through each layer of a Transformer, highlight distances between certain token embeddings through illustrative plots, and identify task-related functions of attention heads by using new metrics. InterpreT is a task-agnostic tool, and its functionalities are demonstrated through the analysis of model behaviours for two disparate tasks: Aspect Based Sentiment Analysis (ABSA) and the Winograd Schema Challenge (WSC).