Gashaw Gebremeskel


2024

Gender Bias Evaluation in Machine Translation for Amharic, Tigrigna, and Afaan Oromoo
Walelign Sewunetie | Atnafu Tonja | Tadesse Belay | Hellina Hailu Nigatu | Gashaw Gebremeskel | Zewdie Mossie | Hussien Seid | Seid Yimam
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies

While Machine Translation (MT) research has progressed over the years, translation systems still suffer from biases, including gender bias. Although an active line of research studies the existence of gender bias in machine translation systems and strategies for mitigating it, there is limited research exploring this phenomenon for low-resource languages. The limited availability of linguistic and computational resources, compounded by the lack of benchmark datasets, makes studying bias for low-resource languages that much more difficult. In this paper, we construct benchmark datasets to evaluate gender bias in machine translation for three low-resource languages: Afaan Oromoo (Orm), Amharic (Amh), and Tigrinya (Tir). Building on prior work, we collected 2400 gender-balanced sentences translated in parallel into the three languages. From human evaluations of the dataset we collected, we found that about 93% of Afaan Oromoo, 80% of Tigrinya, and 72% of Amharic sentences exhibited gender bias. In addition to providing benchmarks for improving gender bias mitigation research in the three languages, we hope the careful documentation of our work will help researchers working on other low-resource languages extend our approach to their languages.