Beyond Surprisal: A Dual Metric Framework for Lexical Skill Acquisition in LLMs

Nazanin Shafiabadi, Guillaume Wisniewski


Abstract
Many studies have explored when and how LLMs learn to use specific words, primarily by examining their learning curves. While these curves capture a model’s capacity to use words correctly in context, they often neglect the equally important skill of avoiding incorrect usage. In this paper, we introduce a new metric, anti-surprisal, which measures a model’s capacity to refrain from using words in inappropriate or unexpected contexts. By examining both correct usage and error avoidance, we offer a more comprehensive perspective on the learning dynamics of LLMs.
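The precise definition of anti-surprisal is given in the paper itself; as a hedged illustration of the idea described in the abstract, the sketch below contrasts the two probes: the surprisal of a word in a context where it is appropriate, and the same model score in a context where it is not. Everything in this sketch is an assumption for illustration only: the helper word_logprob, the GPT-2 checkpoint, the example sentences, and the first-subword simplification are not taken from the paper.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def word_logprob(context: str, word: str) -> float:
        """Log probability (nats) of the first subword of `word` after `context`."""
        ctx_ids = tokenizer(context, return_tensors="pt").input_ids
        # Simplification: score only the first subword token of `word`.
        word_id = tokenizer(" " + word, add_special_tokens=False).input_ids[0]
        with torch.no_grad():
            logits = model(ctx_ids).logits[0, -1]
        return torch.log_softmax(logits, dim=-1)[word_id].item()

    # Surprisal: -log p(w | c) in a context where the word is appropriate.
    # A model that has acquired the word should make this LOW.
    surprisal = -word_logprob("The cat chased the", "mouse")

    # Anti-surprisal (one plausible reading of the abstract): the same
    # quantity in a context where the word is inappropriate. Here a
    # competent model should suppress the word, so HIGHER is better.
    anti_surprisal = -word_logprob("She poured herself a cup of", "mouse")

    print(f"surprisal={surprisal:.2f} nats, anti-surprisal={anti_surprisal:.2f} nats")

Under this reading, learning curves for the two quantities should move in opposite directions over training: surprisal in licensed contexts falls, while anti-surprisal in unlicensed contexts rises, which is the dual perspective on lexical skill acquisition the abstract argues for.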
Anthology ID: 2025.coling-main.443
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 6636–6641
URL: https://aclanthology.org/2025.coling-main.443/
Cite (ACL):
Nazanin Shafiabadi and Guillaume Wisniewski. 2025. Beyond Surprisal: A Dual Metric Framework for Lexical Skill Acquisition in LLMs. In Proceedings of the 31st International Conference on Computational Linguistics, pages 6636–6641, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Beyond Surprisal: A Dual Metric Framework for Lexical Skill Acquisition in LLMs (Shafiabadi & Wisniewski, COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.443.pdf