Narine Kokhlikyan

2023

Using Captum to Explain Generative Language Models
Vivek Miglani | Aobo Yang | Aram Markosyan | Diego Garcia-Olano | Narine Kokhlikyan
Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)

Captum is a comprehensive library for model explainability in PyTorch, offering a range of methods from the interpretability literature to enhance users’ understanding of PyTorch models. In this paper, we introduce new features in Captum that are specifically designed to analyze the behavior of generative language models. We provide an overview of the available functionalities and example applications of their potential for understanding learned associations within generative language models.

2021

pdf bib abs

Fine-grained Interpretation and Causation Analysis in Deep NLP Models
Hassan Sajjad | Narine Kokhlikyan | Fahim Dalvi | Nadir Durrani
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials

Deep neural networks have constantly pushed the state-of-the-art performance in natural language processing and are considered as the de-facto modeling approach in solving complex NLP tasks such as machine translation, summarization and question-answering. Despite the proven efficacy of deep neural networks at-large, their opaqueness is a major cause of concern. In this tutorial, we will present research work on interpreting fine-grained components of a neural network model from two perspectives, i) fine-grained interpretation, and ii) causation analysis. The former is a class of methods to analyze neurons with respect to a desired language concept or a task. The latter studies the role of neurons and input features in explaining the decisions made by the model. We will also discuss how interpretation methods and causation analysis can connect towards better interpretability of model prediction. Finally, we will walk you through various toolkits that facilitate fine-grained interpretation and causation analysis of neural models.