Explaining Text Similarity in Transformer Models

Alexandros Vasileiou, Oliver Eberle


Abstract
As Transformers have become state-of-the-art models for natural language processing (NLP) tasks, the need to understand and explain their predictions is increasingly apparent. Especially in unsupervised applications, such as information retrieval tasks, similarity models built on top of foundation model representations have been widely applied. However, their inner prediction mechanisms have mostly remained opaque. Recent advances in explainable AI have made it possible to mitigate these limitations by leveraging improved explanations for Transformers through layer-wise relevance propagation (LRP). Using BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, we investigate which feature interactions drive similarity in NLP models. We validate the resulting explanations and demonstrate their utility in three corpus-level use cases, analyzing grammatical interactions, multilingual semantics, and biomedical text retrieval. Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
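The paper's core idea is a second-order decomposition of a bilinear similarity score onto pairs of input features. As a minimal sketch (not the paper's method, which propagates relevance through all Transformer layers with LRP), the toy code below assumes a linear feature map `W` over summed token embeddings, where the token-pair decomposition is exact by construction; all names (`W`, `X`, `Xp`) are illustrative assumptions.

```python
import numpy as np

# Hypothetical minimal sketch of a BiLRP-style second-order decomposition.
# Assumption: a *linear* feature map f(x) = W @ sum_i x_i over token
# embeddings. For deep Transformers, BiLRP instead uses layer-wise
# relevance propagation (LRP); here the linear case makes the
# decomposition exact and easy to verify.

rng = np.random.default_rng(0)

d_emb, d_feat = 8, 4
W = rng.normal(size=(d_feat, d_emb))   # shared feature map (assumed linear)

# Two "sentences" as token-embedding matrices (tokens x d_emb)
X = rng.normal(size=(3, d_emb))        # sentence with 3 tokens
Xp = rng.normal(size=(5, d_emb))       # sentence with 5 tokens

# Bilinear similarity on pooled (summed) features
f = W @ X.sum(axis=0)
fp = W @ Xp.sum(axis=0)
similarity = f @ fp

# Second-order relevance of each token pair (i, j):
#   R[i, j] = sum_m (W x_i)_m (W x'_j)_m
F = X @ W.T                            # per-token features, shape (3, d_feat)
Fp = Xp @ W.T                          # per-token features, shape (5, d_feat)
R = F @ Fp.T                           # token-pair relevance matrix, (3, 5)

# Conservation: pairwise relevances sum exactly to the similarity score
print(np.isclose(R.sum(), similarity))
```

Inspecting `R` reveals which token pairs drive the similarity score up or down, which is the kind of interaction analysis the paper scales to grammatical, multilingual, and biomedical corpora.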
Anthology ID:
2024.naacl-long.435
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
7852–7866
URL:
https://aclanthology.org/2024.naacl-long.435
Cite (ACL):
Alexandros Vasileiou and Oliver Eberle. 2024. Explaining Text Similarity in Transformer Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 7852–7866, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Explaining Text Similarity in Transformer Models (Vasileiou & Eberle, NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.435.pdf
Copyright:
2024.naacl-long.435.copyright.pdf