Thomas Middleton


2021

Gender Bias in Natural Language Processing Across Human Languages
Abigail Matthews | Isabella Grasso | Christopher Mahoney | Yan Chen | Esma Wali | Thomas Middleton | Mariama Njie | Jeanna Matthews
Proceedings of the First Workshop on Trustworthy Natural Language Processing

Natural Language Processing (NLP) systems are at the heart of many critical automated decision-making systems making crucial recommendations about our future world. Gender bias in NLP has been well studied in English but less so in other languages. In this paper, a team including speakers of 9 languages (Chinese, Spanish, English, Arabic, German, French, Farsi, Urdu, and Wolof) reports and analyzes measurements of gender bias in the Wikipedia corpora for these 9 languages. We develop extensions to profession-level and corpus-level gender bias metric calculations originally designed for English and apply them to the 8 other languages, including languages with grammatically gendered nouns that have distinct feminine, masculine, and neuter profession words. We discuss future work that would benefit immensely from a computational linguistics perspective.