Quantifying Intimacy in Language

Jiaxin Pei, David Jurgens


Abstract
Intimacy is a fundamental aspect of how we relate to others in social settings. Language encodes the social information of intimacy through both topics and other more subtle cues (such as linguistic hedging and swearing). Here, we introduce a new computational framework for studying expressions of intimacy in language, with an accompanying dataset and deep learning model for accurately predicting the intimacy level of questions (Pearson r = 0.87). By analyzing a dataset of 80.5M questions across social media, books, and films, we show that individuals employ interpersonal pragmatic moves in their language to align their intimacy with social settings. Then, in three studies, we further demonstrate how individuals modulate their intimacy to match social norms around gender, social distance, and audience, each validating key findings from studies in social psychology. Our work demonstrates that intimacy is a pervasive and impactful social dimension of language.
Anthology ID:
2020.emnlp-main.428
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5307–5326
URL:
https://aclanthology.org/2020.emnlp-main.428
DOI:
10.18653/v1/2020.emnlp-main.428
Cite (ACL):
Jiaxin Pei and David Jurgens. 2020. Quantifying Intimacy in Language. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5307–5326, Online. Association for Computational Linguistics.
Cite (Informal):
Quantifying Intimacy in Language (Pei & Jurgens, EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.428.pdf
Video:
https://slideslive.com/38939316