Tomoyuki Jinno

2026

Cosine Similarity as Logits?: A Scalable Knowledge Probe Using Embedding Vectors from Generative Language Models
Tomoyuki Jinno | Kazuki Hayashi | Yusuke Sakai | Hidetaka Kamigaito | Taro Watanabe
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Recently, the use of pretrained language models (PLMs) as soft knowledge bases has gained growing interest, sparking the development of knowledge probes to evaluate their factual knowledge retrieval capabilities. However, existing knowledge probes for generative PLMs that support multi-token entities exhibit quadratic time complexity 𝒪(n²), where n is the number of candidate entities, which limits the size of the knowledge graphs that can be used for probing. To address this, we propose the DEcoder Embedding-based Relational (DEER) probe, which uses embedding vectors extracted from generative PLMs. The DEER probe achieves an effective time complexity of linear order 𝒪(n), supports rank-based evaluation metrics including Hit@k, handles multi-token entity names, and enables probing while disambiguating homographic tail-entity names. We empirically show that the DEER probe correlates with existing knowledge probes, validating its probing capability, and we demonstrate the practical benefits of its improved scalability.
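
As a rough illustration of the rank-based evaluation the abstract mentions, the sketch below ranks candidate entities by cosine similarity to a query embedding and computes Hit@k. It is not the authors' implementation: the function names (cosine, rank_candidates, hit_at_k), the toy 2-dimensional vectors, and the assumption that each candidate entity is embedded once and then reused across queries are all hypothetical, chosen only to make the idea concrete.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query: Sequence[float],
                    candidates: Sequence[Sequence[float]]) -> List[int]:
    """Return candidate indices sorted by descending similarity to the query."""
    scores = [cosine(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


def hit_at_k(queries: Sequence[Sequence[float]],
             gold: Sequence[int],
             candidates: Sequence[Sequence[float]],
             k: int) -> float:
    """Fraction of queries whose gold candidate index appears in the top k."""
    hits = 0
    for query, gold_index in zip(queries, gold):
        if gold_index in rank_candidates(query, candidates)[:k]:
            hits += 1
    return hits / len(queries)


# Toy data (hypothetical): in a real probe these vectors would be
# embeddings extracted from a generative PLM, one per candidate entity.
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.2, 0.8]]
gold = [0, 2]  # index of the correct tail entity for each query

print(hit_at_k(queries, gold, candidates, k=1))
print(hit_at_k(queries, gold, candidates, k=2))
```

In this sketch the candidate vectors are computed independently of the queries, so the number of embedding computations grows with the number of candidates rather than with the number of query-candidate pairs. Whether this matches how the DEER probe attains its linear cost is an assumption here.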
