Automatically Inferring Gender Associations from Language

Serina Chang, Kathy McKeown


Abstract
In this paper, we pose the question: do people talk about women and men in different ways? We introduce two datasets and a novel integration of approaches for automatically inferring gender associations from language, discovering coherent word clusters, and labeling the clusters for the semantic concepts they represent. The datasets allow us to compare how people write about women and men in two different settings – one set draws from celebrity news and the other from student reviews of computer science professors. We demonstrate that there are large-scale differences in the ways that people talk about women and men and that these differences vary across domains. Human evaluations show that our methods significantly outperform strong baselines.
Anthology ID:
D19-1579
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
5746–5752
URL:
https://aclanthology.org/D19-1579
DOI:
10.18653/v1/D19-1579
Cite (ACL):
Serina Chang and Kathy McKeown. 2019. Automatically Inferring Gender Associations from Language. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5746–5752, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Automatically Inferring Gender Associations from Language (Chang & McKeown, EMNLP-IJCNLP 2019)
PDF:
https://aclanthology.org/D19-1579.pdf
Attachment:
D19-1579.Attachment.pdf