Class Explanations: the Role of Domain-Specific Content and Stop Words

Denitsa Saynova, Bastiaan Bruinsma, Moa Johansson, Richard Johansson


Abstract
We address two understudied areas in explainability for neural text models. First, class explanations: which features are descriptive of an entire class, rather than explaining single input instances? Second, the type of features used to provide explanations: does the explanation rely on statistical patterns of word usage or on the presence of domain-specific content words? We present a method to extract class explanations, together with strategies to differentiate between two types of explanations: domain-specific signals and statistical variations in the frequencies of common words. We demonstrate our method in a case study analysing transcripts of political debates in the Swedish Riksdag.
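To make the distinction between the two explanation types concrete, here is a minimal, hypothetical Python sketch of the general idea: per-instance token attributions are averaged into class-level scores, and the top-ranked features are split into stop words versus domain-specific content words. The function name, the small Swedish stop-word list, and the toy attribution inputs are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the paper's exact procedure): aggregate per-instance
# token attributions into class-level scores, then split the top features into
# stop words vs. domain-specific content words.
from collections import defaultdict

# Illustrative subset of Swedish stop words (assumption, not from the paper).
SWEDISH_STOP_WORDS = {"och", "att", "det", "som", "en", "av", "är", "för", "på", "med"}

def class_explanation(instance_attributions, top_k=10):
    """instance_attributions: one dict of token -> attribution score per
    document predicted as the class of interest."""
    totals, counts = defaultdict(float), defaultdict(int)
    for attribution in instance_attributions:
        for token, score in attribution.items():
            totals[token] += score
            counts[token] += 1
    # Average attribution per token across all instances of the class.
    averaged = {tok: totals[tok] / counts[tok] for tok in totals}
    ranked = sorted(averaged.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Separate statistical/stop-word signals from domain-specific content words.
    stop = [(t, s) for t, s in ranked if t in SWEDISH_STOP_WORDS]
    content = [(t, s) for t, s in ranked if t not in SWEDISH_STOP_WORDS]
    return {"stop_words": stop, "content_words": content}

if __name__ == "__main__":
    toy = [{"klimat": 0.9, "och": 0.4, "skatt": 0.2},
           {"klimat": 0.7, "att": 0.3, "miljö": 0.6}]
    print(class_explanation(toy))
```

In this toy setup, a class explanation dominated by entries in `content_words` would point to domain-specific content, while one dominated by `stop_words` would suggest the classifier is picking up on frequency patterns of common words.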
Anthology ID: 2023.nodalida-1.12
Volume: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month: May
Year: 2023
Address: Tórshavn, Faroe Islands
Editors: Tanel Alumäe, Mark Fishel
Venue: NoDaLiDa
Publisher: University of Tartu Library
Pages: 103–112
URL: https://aclanthology.org/2023.nodalida-1.12
Cite (ACL): Denitsa Saynova, Bastiaan Bruinsma, Moa Johansson, and Richard Johansson. 2023. Class Explanations: the Role of Domain-Specific Content and Stop Words. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 103–112, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal): Class Explanations: the Role of Domain-Specific Content and Stop Words (Saynova et al., NoDaLiDa 2023)
PDF: https://aclanthology.org/2023.nodalida-1.12.pdf