Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models

Benjamin Icard, Evangelia Zve, Lila Sainero, Alice Breton, Jean-Gabriel Ganascia


Abstract
This paper analyzes how writing style affects the dispersion of embedding vectors across multiple state-of-the-art language models. While early transformer models primarily aligned with topic modeling, this study examines the role of writing style in shaping embedding spaces. Using a literary corpus that alternates between topics and styles, we compare the sensitivity of language models across French and English. By analyzing the specific impact of style on embedding dispersion, we aim to better understand how language models process stylistic information, contributing to their overall interpretability.
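The page does not define how embedding dispersion is measured. As a minimal sketch of one common way to quantify it, assuming dispersion is taken as the mean pairwise cosine distance between passage embeddings, the following Python snippet shows how such a score could be computed and compared across styles; the embedding function and passage sets are hypothetical placeholders, not the authors' actual pipeline.

import numpy as np

def dispersion(embeddings: np.ndarray) -> float:
    """Mean pairwise cosine distance of a set of embedding vectors.

    embeddings: array of shape (n_texts, dim), one row per embedded passage.
    """
    # L2-normalize so the dot product equals cosine similarity.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T                  # (n, n) cosine similarities
    n = embeddings.shape[0]
    iu = np.triu_indices(n, k=1)              # unique unordered pairs
    return float(np.mean(1.0 - sims[iu]))     # mean cosine distance

# Hypothetical usage: `embed` stands for any sentence-embedding model
# (e.g. a Sentence-Transformers encoder); passages_style_a / passages_style_b
# are sets of passages sharing a topic but written in different styles.
# disp_a = dispersion(embed(passages_style_a))
# disp_b = dispersion(embed(passages_style_b))
# Comparing such scores across models indicates how strongly each model's
# embedding space reacts to a change in writing style.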
Anthology ID: 2025.coling-main.236
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 3511–3522
URL: https://aclanthology.org/2025.coling-main.236/
Cite (ACL): Benjamin Icard, Evangelia Zve, Lila Sainero, Alice Breton, and Jean-Gabriel Ganascia. 2025. Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models. In Proceedings of the 31st International Conference on Computational Linguistics, pages 3511–3522, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models (Icard et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.236.pdf