Discovering Differences in the Representation of People using Contextualized Semantic Axes

Li Lucy, Divya Tadimeti, David Bamman


Abstract
A common paradigm for identifying semantic differences across social and temporal contexts is the use of static word embeddings and their distances. In particular, past work has compared embeddings against “semantic axes” that represent two opposing concepts. We extend this paradigm to BERT embeddings, and construct contextualized axes that mitigate the pitfall where antonyms have neighboring representations. We validate and demonstrate these axes on two people-centric datasets: occupations from Wikipedia, and multi-platform discussions in extremist, men’s communities over fourteen years. In both studies, contextualized semantic axes can characterize differences among instances of the same word type. In the latter study, we show that references to women and the contexts around them have become more detestable over time.
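As a rough illustration of the "semantic axes" idea mentioned in the abstract, the sketch below builds an axis as the difference between the mean vectors of two opposing poles, then scores a target vector by its cosine similarity to that axis. This is the generic static-embedding formulation, not the paper's contextualized construction. The function names and the toy 3-dimensional vectors are assumptions made for illustration; in practice the vectors would come from an embedding model such as BERT.

```python
import math

# Hypothetical helper names; toy vectors stand in for real embeddings.

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def build_axis(pole_a, pole_b):
    """Axis pointing from pole B toward pole A: mean(A) - mean(B)."""
    a = mean_vector(pole_a)
    b = mean_vector(pole_b)
    return [x - y for x, y in zip(a, b)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

# Toy embeddings for words at each pole of one axis (made-up values).
pole_a = [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]]
pole_b = [[-1.0, 0.0, 0.1], [-0.9, 0.2, 0.1]]
axis = build_axis(pole_a, pole_b)

# A positive score places the target nearer pole A, a negative one nearer pole B.
target = [0.7, 0.3, 0.1]
score = cosine(target, axis)
print(score > 0)
```

With contextualized embeddings, each occurrence of a word has its own vector, so the same scoring step could in principle be applied per occurrence to compare instances of the same word type.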
Anthology ID:
2022.emnlp-main.228
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3477–3494
URL:
https://aclanthology.org/2022.emnlp-main.228
DOI:
10.18653/v1/2022.emnlp-main.228
Cite (ACL):
Li Lucy, Divya Tadimeti, and David Bamman. 2022. Discovering Differences in the Representation of People using Contextualized Semantic Axes. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 3477–3494, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Discovering Differences in the Representation of People using Contextualized Semantic Axes (Lucy et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.228.pdf