The Echoes of the ‘I’: Tracing Identity with Demographically Enhanced Word Embeddings

Ivan Smirnov


Abstract
Identity is one of the most commonly studied constructs in social science. However, despite extensive theoretical work on identity, there remains a need for additional empirical data to validate and refine existing theories. This paper introduces a novel approach to studying identity by enhancing word embeddings with socio-demographic information. As a proof of concept, we demonstrate that our approach successfully reproduces and extends established findings regarding gendered self-views. Our methodology can be applied in a wide variety of settings, allowing researchers to tap into a vast pool of naturally occurring data, such as social media posts. Unlike similar methods already introduced in computer science, our approach allows for the study of differences between social groups. This could be particularly appealing to social scientists and may encourage the faster adoption of computational methods in the field.
Anthology ID:
2024.cpss-1.9
Volume:
Proceedings of the 4th Workshop on Computational Linguistics for the Political and Social Sciences: Long and short papers
Month:
Sep
Year:
2024
Address:
Vienna, Austria
Editors:
Christopher Klamm, Gabriella Lapesa, Simone Paolo Ponzetto, Ines Rehbein, Indira Sen
Venues:
cpss | WS
Publisher:
Association for Computational Linguistics
Pages:
112–118
URL:
https://aclanthology.org/2024.cpss-1.9
Cite (ACL):
Ivan Smirnov. 2024. The Echoes of the ‘I’: Tracing Identity with Demographically Enhanced Word Embeddings. In Proceedings of the 4th Workshop on Computational Linguistics for the Political and Social Sciences: Long and short papers, pages 112–118, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
The Echoes of the ‘I’: Tracing Identity with Demographically Enhanced Word Embeddings (Smirnov, cpss-WS 2024)
PDF:
https://aclanthology.org/2024.cpss-1.9.pdf