Simbiat Ajao
2026
Power Asymmetries, Bias, and AI, a Reflection of Society on Low-Resourced Languages - African Languages as Case Study
Simbiat Ajao
Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
Artificial intelligence (AI) systems have become the primary intermediary for access to information, services, and opportunities, and there are growing concerns about how AI reproduces and amplifies existing social inequalities. This is especially evident in language technologies, where a small number of dominant languages (which we refer to as big languages) and their cultural contexts shape the training, design, and evaluation of models. This paper examines the intersections of power asymmetries, linguistic bias, and cultural representation in AI, with a focus on African languages and communities. We argue that current Natural Language Processing (NLP) systems reflect global imbalances in the availability of data, infrastructure, and decision-making power, often marginalizing low-resourced languages and their cultural particularities. How data are structured strongly determines the outcomes of the systems built on them. Drawing on examples from speech recognition, machine translation, and large language models, we highlight the social and cultural consequences of linguistic exclusion, including reduced accessibility, misinterpretation, and digital invisibility. Finally, we identify and discuss pathways toward more equitable language technologies, emphasizing community-led data practices, interdisciplinary collaboration, and context-aware evaluation frameworks. By foregrounding language as both a technical and political concern, this work advocates for African-centered approaches to NLP that promote fairness, accountability, and linguistic justice in AI development.
2024
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
Jiayi Wang | David Ifeoluwa Adelani | Sweta Agrawal | Marek Masiak | Ricardo Rei | Eleftheria Briakou | Marine Carpuat | Xuanli He | Sofia Bourhim | Andiswa Bukula | Muhidin Mohamed | Temitayo Olatoye | Tosin Adewumi | Hamam Mokayed | Christine Mwase | Wangui Kimotho | Foutse Yuehgoh | Anuoluwapo Aremu | Jessica Ojo | Shamsuddeen Hassan Muhammad | Salomey Osei | Abdul-Hakeem Omotayo | Chiamaka Chukwuneke | Perez Ogayo | Oumaima Hourrane | Salma El Anigri | Lolwethu Ndolela | Thabiso Mangwana | Shafie Abdi Mohamed | Ayinde Hassan | Oluwabusayo Olufunke Awoyomi | Lama Alkhaled | Sana Al-Azzawi | Naome A. Etori | Millicent Ochieng | Clemencia Siro | Samuel Njoroge | Eric Muchiri | Wangari Kimotho | Lyse Naomi Wamba Momo | Daud Abolade | Simbiat Ajao | Iyanuoluwa Shode | Ricky Macharm | Ruqayya Nasir Iro | Saheed S. Abdullahi | Stephen E. Moore | Bernard Opoku | Zainab Akinjobi | Abeeb Afolabi | Nnaemeka Obiefuna | Onyekachi Raphael Ogbu | Sam Brian | Verrah Akinyi Otiende | Chinedu Emmanuel Mbonu | Sakayo Toadoum Sari | Yao Lu | Pontus Stenetorp
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create the state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).
Co-authors
- Saheed S. Abdullahi 1
- Daud Abolade 1
- David Ifeoluwa Adelani 1
- Tosin Adewumi 1
- Abeeb Afolabi 1
- Sweta Agrawal 1
- Zainab Akinjobi 1
- Sana Al-Azzawi 1
- Lama Alkhaled 1
- Anuoluwapo Aremu 1
- Oluwabusayo Olufunke Awoyomi 1
- Sofia Bourhim 1
- Eleftheria Briakou 1
- Sam Brian 1
- Andiswa Bukula 1
- Marine Carpuat 1
- Chiamaka Chukwuneke 1
- Salma El Anigri 1
- Naome A. Etori 1
- Ayinde Hassan 1
- Xuanli He 1
- Oumaima Hourrane 1
- Ruqayya Nasir Iro 1
- Wangui Kimotho 1
- Wangari Kimotho 1
- Yao Lu 1
- Ricky Macharm 1
- Thabiso Mangwana 1
- Marek Masiak 1
- Chinedu Emmanuel Mbonu 1
- Muhidin Mohamed 1
- Shafie Abdi Mohamed 1
- Hamam Mokayed 1
- Stephen E. Moore 1
- Eric Muchiri 1
- Shamsuddeen Hassan Muhammad 1
- Christine Mwase 1
- Lolwethu Ndolela 1
- Samuel Njoroge 1
- Nnaemeka Obiefuna 1
- Millicent Ochieng 1
- Perez Ogayo 1
- Onyekachi Raphael Ogbu 1
- Jessica Ojo 1
- Temitayo Olatoye 1
- Abdul-Hakeem Omotayo 1
- Bernard Opoku 1
- Salomey Osei 1
- Verrah Akinyi Otiende 1
- Ricardo Rei 1
- Iyanuoluwa Shode 1
- Clemencia Siro 1
- Pontus Stenetorp 1
- Sakayo Toadoum Sari 1
- Lyse Naomi Wamba Momo 1
- Jiayi Wang 1
- Foutse Yuehgoh 1