Automatic Gender Identification and Reinflection in Arabic

Nizar Habash, Houda Bouamor, Christine Chung


Abstract
The impressive progress in many Natural Language Processing (NLP) applications has increased awareness of some of the biases these NLP systems have with regard to gender identities. In this paper, we propose an approach to extend biased, single-output, gender-blind NLP systems with gender-specific alternative reinflections. We focus on Arabic, a gender-marking, morphologically rich language, in the context of machine translation (MT) from English, and on first-person-singular constructions only. Our contributions are the development of a system-independent gender-awareness wrapper and the building of a corpus for training and evaluating first-person-singular gender identification and reinflection in Arabic. Our results demonstrate the viability of this approach, with an 8% relative increase in BLEU score for first-person-singular feminine and a comparable 5.3% increase for first-person-singular masculine, on top of a state-of-the-art gender-blind MT system on a held-out test set.
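The wrapper idea in the abstract can be illustrated with a minimal sketch: a gender-blind system produces one output, a gender identification step labels it, and a reinflection step produces the masculine and feminine alternatives. Everything in the sketch is an assumption made for illustration and is not taken from the paper: the two-word lexicon, the labels "M", "F" and "B" (gender-ambiguous), the function names, and the stand-in `toy_mt` translator. The actual system is trained on a corpus rather than driven by a lookup table.

```python
from typing import Callable, Dict

# Hypothetical toy lexicon (masculine -> feminine); not from the paper.
MASC_TO_FEM: Dict[str, str] = {"سعيد": "سعيدة", "متعب": "متعبة"}
FEM_TO_MASC: Dict[str, str] = {fem: masc for masc, fem in MASC_TO_FEM.items()}


def identify_gender(sentence: str) -> str:
    """Label a sentence 'F', 'M', or 'B' (ambiguous) by lexicon lookup."""
    tokens = sentence.split()
    if any(token in FEM_TO_MASC for token in tokens):
        return "F"
    if any(token in MASC_TO_FEM for token in tokens):
        return "M"
    return "B"


def reinflect(sentence: str, target: str) -> str:
    """Rewrite gender-marked tokens to the target gender ('M' or 'F')."""
    table = MASC_TO_FEM if target == "F" else FEM_TO_MASC
    return " ".join(table.get(token, token) for token in sentence.split())


def gender_aware(translate: Callable[[str], str], source: str) -> Dict[str, str]:
    """Wrap any single-output translator and return both gendered outputs."""
    output = translate(source)
    if identify_gender(output) == "B":
        # Nothing gender-marked was found, so both alternatives are identical.
        return {"M": output, "F": output}
    return {"M": reinflect(output, "M"), "F": reinflect(output, "F")}


def toy_mt(source: str) -> str:
    """Stand-in for a gender-blind MT system; always picks one form."""
    return {"I am happy": "أنا سعيد"}.get(source, source)


result = gender_aware(toy_mt, "I am happy")
```

Because `gender_aware` only takes a callable that maps a source string to a target string, the underlying translator can be swapped without touching the identification or reinflection steps, which is the sense in which such a wrapper is system-independent.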
Anthology ID:
W19-3822
Original:
W19-3822v1
Version 2:
W19-3822v2
Volume:
Proceedings of the First Workshop on Gender Bias in Natural Language Processing
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Marta R. Costa-jussà, Christian Hardmeier, Will Radford, Kellie Webster
Venue:
GeBNLP
Publisher:
Association for Computational Linguistics
Pages:
155–165
URL:
https://aclanthology.org/W19-3822/
DOI:
10.18653/v1/W19-3822
Cite (ACL):
Nizar Habash, Houda Bouamor, and Christine Chung. 2019. Automatic Gender Identification and Reinflection in Arabic. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, pages 155–165, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Automatic Gender Identification and Reinflection in Arabic (Habash et al., GeBNLP 2019)
PDF:
https://aclanthology.org/W19-3822.pdf
Data
OpenSubtitles