Scaling up Discovery of Latent Concepts in Deep NLP Models

Majd Hawasly, Fahim Dalvi, Nadir Durrani


Abstract
Despite the revolution caused by deep NLP models, they remain black boxes, necessitating research into their decision-making processes. A recent work by Dalvi et al. (2022) carried out representation analysis through the lens of clustering latent spaces within pre-trained language models (PLMs), but that approach is limited in scale due to the high cost of running agglomerative hierarchical clustering. This paper studies clustering algorithms in order to scale the discovery of encoded concepts in PLM representations to larger datasets and models. We propose metrics for assessing the quality of discovered latent concepts and use them to compare the studied clustering algorithms. We find that K-Means-based concept discovery significantly enhances efficiency while maintaining the quality of the obtained concepts. Furthermore, we demonstrate the practicality of this newfound efficiency by scaling latent concept discovery to LLMs and phrasal concepts.
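
The abstract contrasts agglomerative hierarchical clustering with K-Means for grouping token representations into latent concepts. Below is a minimal, hypothetical sketch of what K-Means-based concept discovery could look like; it is not the authors' released code (see the Software link below), and the model name, layer index, and cluster count are illustrative assumptions.

```python
# Sketch: cluster token representations from one layer of a pre-trained
# model with K-Means, the algorithm the paper finds to scale well.
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-cased"  # assumption: any HF encoder would do
LAYER = 9                       # assumption: an arbitrary middle layer
N_CLUSTERS = 600                # assumption: number of latent concepts

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

sentences = ["The cat sat on the mat.", "Stocks fell sharply on Monday."]

reps, tokens = [], []
with torch.no_grad():
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt")
        hidden = model(**enc).hidden_states[LAYER][0]  # (seq_len, dim)
        toks = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
        for tok, vec in zip(toks, hidden):
            if tok not in ("[CLS]", "[SEP]"):  # drop special tokens
                tokens.append(tok)
                reps.append(vec.numpy())

# K-Means runs in roughly O(n * k * iterations), versus the quadratic
# time/memory of agglomerative clustering, which is what makes large
# numbers of token occurrences n feasible.
labels = KMeans(n_clusters=min(N_CLUSTERS, len(reps)), n_init=10,
                random_state=0).fit_predict(reps)

# Each cluster of token occurrences is one candidate latent concept.
for c in sorted(set(labels))[:5]:
    members = [t for t, l in zip(tokens, labels) if l == c]
    print(f"concept {c}: {members}")
```

In practice the representations would be extracted from a large corpus rather than two sentences, and the resulting clusters would then be evaluated with quality metrics such as those the paper proposes.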
Anthology ID:
2024.eacl-long.48
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
793–806
URL:
https://aclanthology.org/2024.eacl-long.48
Cite (ACL):
Majd Hawasly, Fahim Dalvi, and Nadir Durrani. 2024. Scaling up Discovery of Latent Concepts in Deep NLP Models. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 793–806, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Scaling up Discovery of Latent Concepts in Deep NLP Models (Hawasly et al., EACL 2024)
PDF:
https://aclanthology.org/2024.eacl-long.48.pdf
Software:
2024.eacl-long.48.software.zip