Inferring Gender: A Scalable Methodology for Gender Detection with Online Lexical Databases

Marion Bartl, Susan Leavy


Abstract
This paper presents a new method for automatic detection of gendered terms in large-scale language datasets. Currently, the evaluation of gender bias in natural language processing relies on the use of manually compiled lexicons of gendered expressions, such as pronouns and words that imply gender. However, manually compiled lists of lexical gender become static if they are not periodically updated, and compiling them often involves value judgements by individual annotators and researchers. Moreover, terms not included in the lexicons fall outside the range of analysis. To address these issues, we devised a scalable dictionary-based method to automatically detect lexical gender that can provide a dynamic, up-to-date analysis with high coverage. Our approach reaches over 80% accuracy in determining the lexical gender of words retrieved randomly from a Wikipedia sample and when testing on a list of gendered words used in previous research.
Anthology ID:
2022.ltedi-1.7
Volume:
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Bharathi Raja Chakravarthi, B Bharathi, John P McCrae, Manel Zarrouk, Kalika Bali, Paul Buitelaar
Venue:
LTEDI
Publisher:
Association for Computational Linguistics
Pages:
47–58
URL:
https://aclanthology.org/2022.ltedi-1.7
DOI:
10.18653/v1/2022.ltedi-1.7
Cite (ACL):
Marion Bartl and Susan Leavy. 2022. Inferring Gender: A Scalable Methodology for Gender Detection with Online Lexical Databases. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pages 47–58, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Inferring Gender: A Scalable Methodology for Gender Detection with Online Lexical Databases (Bartl & Leavy, LTEDI 2022)
PDF:
https://aclanthology.org/2022.ltedi-1.7.pdf
Video:
https://aclanthology.org/2022.ltedi-1.7.mp4
Code:
marionbartl/lexical-gender