Women’s Syntactic Resilience and Men’s Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing

Aparna Garimella, Carmen Banea, Dirk Hovy, Rada Mihalcea


Abstract
Several linguistic studies have shown the prevalence of various lexical and grammatical patterns in texts authored by a person of a particular gender, but models for part-of-speech tagging and dependency parsing have still not adapted to account for these differences. To address this, we annotate the Wall Street Journal part of the Penn Treebank with the gender information of the articles’ authors, and build taggers and parsers trained on this data that show performance differences in text written by men and women. Further analyses reveal numerous part-of-speech tags and syntactic relations whose prediction performances benefit from the prevalence of a specific gender in the training data. The results underscore the importance of accounting for gendered differences in syntactic tasks, and outline future venues for developing more accurate taggers and parsers. We release our data to the research community.
Anthology ID:
P19-1339
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
3493–3498
Language:
URL:
https://aclanthology.org/P19-1339
DOI:
10.18653/v1/P19-1339
Bibkey:
Cite (ACL):
Aparna Garimella, Carmen Banea, Dirk Hovy, and Rada Mihalcea. 2019. Women’s Syntactic Resilience and Men’s Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3493–3498, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Women’s Syntactic Resilience and Men’s Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing (Garimella et al., ACL 2019)
Copy Citation:
PDF:
https://aclanthology.org/P19-1339.pdf
Video:
 https://vimeo.com/384801759
Data
Penn Treebank