Understanding Token Probability Encoding in Output Embeddings

Hakaze Cho, Yoshihiro Sakai, Kenshiro Tanaka, Mariko Kato, Naoya Inoue


Abstract
In this paper, we investigate how output token probability information is encoded in the output embeddings of language models. We find an approximate, common log-linear encoding of output token probabilities within the output embedding vectors and empirically demonstrate that it is accurate and sparse. To test causality, we steer this encoding in the output embeddings and show that doing so modifies the output probability distribution accurately. Moreover, the sparsity of the output probability encoding suggests that a large number of dimensions in the output embeddings do not contribute to causal language modeling. We therefore delete these output-unrelated dimensions and find that more than 30% of the dimensions can be removed without a significant shift in the output distribution or in sequence generation. Additionally, examining the pre-training dynamics of language models, we find that the output embeddings capture corpus token frequency information in the early steps, even before the parameters visibly begin to converge.
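The log-linear encoding described in the abstract can be pictured with a toy regression. The following is a minimal sketch on synthetic data, not the authors' procedure: the embedding matrix `E`, the weight vector `w_true`, the noise level, and the pruning threshold are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 2000, 64

# Synthetic stand-in for an output embedding matrix (one row per token).
E = rng.normal(size=(vocab_size, dim))

# Assumption: only a few dimensions carry probability information (sparsity).
w_true = np.zeros(dim)
w_true[:8] = [1.0, -0.8, 0.6, -0.5, 0.9, -0.7, 0.4, -0.6]

# Synthetic log-probabilities: linear in the embedding, plus small noise,
# then shifted so that the probabilities sum to one.
log_p = E @ w_true + rng.normal(scale=0.05, size=vocab_size)
log_p -= np.log(np.exp(log_p).sum())

# Fit log p(token) ~ E @ w + b with ordinary least squares.
X = np.hstack([E, np.ones((vocab_size, 1))])
coef, *_ = np.linalg.lstsq(X, log_p, rcond=None)
w_hat, b_hat = coef[:-1], coef[-1]

# Accuracy of the linear fit (coefficient of determination).
pred = X @ coef
r2 = 1.0 - ((log_p - pred) ** 2).sum() / ((log_p - log_p.mean()) ** 2).sum()

# Sparsity: keep only dimensions whose fitted weight is non-negligible
# (the 0.1 threshold is an arbitrary choice for this toy setup).
keep = np.abs(w_hat) > 0.1

# Deleting the other dimensions barely moves the predicted log-probabilities.
pred_pruned = E[:, keep] @ w_hat[keep] + b_hat
max_shift = np.abs(pred - pred_pruned).max()

print(f"R^2 = {r2:.4f}, kept {keep.sum()} of {dim} dims, max shift = {max_shift:.4f}")
```

In this toy setting the fit is accurate and most dimensions can be dropped by construction. With a real model, `E` would be the model's output embedding matrix and `log_p` an empirical estimate of token log-probabilities, and how well the relation holds is the empirical question the paper studies.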
Anthology ID:
2025.coling-main.708
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
10618–10633
URL:
https://aclanthology.org/2025.coling-main.708/
Cite (ACL):
Hakaze Cho, Yoshihiro Sakai, Kenshiro Tanaka, Mariko Kato, and Naoya Inoue. 2025. Understanding Token Probability Encoding in Output Embeddings. In Proceedings of the 31st International Conference on Computational Linguistics, pages 10618–10633, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Understanding Token Probability Encoding in Output Embeddings (Cho et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.708.pdf