Fine-grained Interpretation and Causation Analysis in Deep NLP Models

Hassan Sajjad, Narine Kokhlikyan, Fahim Dalvi, Nadir Durrani


Abstract
Deep neural networks have consistently advanced the state of the art in natural language processing and are considered the de facto modeling approach for complex NLP tasks such as machine translation, summarization, and question answering. Despite the proven efficacy of deep neural networks at large, their opaqueness is a major cause for concern. In this tutorial, we will present research on interpreting fine-grained components of a neural network model from two perspectives: (i) fine-grained interpretation and (ii) causation analysis. The former is a class of methods for analyzing neurons with respect to a desired language concept or task. The latter studies the role of neurons and input features in explaining the decisions made by the model. We will also discuss how interpretation methods and causation analysis can be connected to improve the interpretability of model predictions. Finally, we will walk through various toolkits that facilitate fine-grained interpretation and causation analysis of neural models.
Anthology ID:
2021.naacl-tutorials.2
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials
Month:
June
Year:
2021
Address:
Online
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
5–10
URL:
https://aclanthology.org/2021.naacl-tutorials.2
DOI:
10.18653/v1/2021.naacl-tutorials.2
PDF:
https://aclanthology.org/2021.naacl-tutorials.2.pdf