Emile Niyomutabazi
2023
MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African languages
Cheikh M. Bamba Dione | David Ifeoluwa Adelani | Peter Nabende | Jesujoba Alabi | Thapelo Sindane | Happy Buzaaba | Shamsuddeen Hassan Muhammad | Chris Chinenye Emezue | Perez Ogayo | Anuoluwapo Aremu | Catherine Gitau | Derguene Mbaye | Jonathan Mukiibi | Blessing Sibanda | Bonaventure F. P. Dossou | Andiswa Bukula | Rooweither Mabuya | Allahsera Auguste Tapo | Edwin Munkoh-Buabeng | Victoire Memdjokam Koagne | Fatoumata Ouoba Kabore | Amelia Taylor | Godson Kalipe | Tebogo Macucwa | Vukosi Marivate | Tajuddeen Gwadabe | Mboning Tchiaze Elvis | Ikechukwu Onyenwe | Gratien Atindogbe | Tolulope Adelani | Idris Akinade | Olanrewaju Samuel | Marien Nahimana | Théogène Musabeyezu | Emile Niyomutabazi | Ester Chimhenga | Kudzai Gotosa | Patrick Mizha | Apelete Agbolo | Seydou Traore | Chinedu Uchechukwu | Aliyu Yusuf | Muhammad Abdullahi | Dietrich Klakow
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In this paper, we present MasakhaPOS, the largest part-of-speech (POS) dataset for 20 typologically diverse African languages. We discuss the challenges in annotating POS for these languages using the Universal Dependencies (UD) guidelines. We conducted extensive POS baseline experiments using both conditional random fields and several multilingual pre-trained language models. We applied various cross-lingual transfer models trained with data available in UD. Evaluating on the MasakhaPOS dataset, we show that choosing the best transfer language(s) in both single-source and multi-source setups greatly improves the POS tagging performance of the target languages, in particular when combined with parameter-fine-tuning methods. Crucially, transferring knowledge from a language that matches the language family and morphosyntactic properties seems to be more effective for POS tagging in unseen languages.
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages
Odunayo Ogundepo | Tajuddeen R. Gwadabe | Clara E. Rivera | Jonathan H. Clark | Sebastian Ruder | David Ifeoluwa Adelani | Bonaventure F. P. Dossou | Abdou Aziz Diop | Claytone Sikasote | Gilles Hacheme | Happy Buzaaba | Ignatius Ezeani | Rooweither Mabuya | Salomey Osei | Chris Emezue | Albert Njoroge Kahira | Shamsuddeen Hassan Muhammad | Akintunde Oladipo | Abraham Toluwase Owodunni | Atnafu Lambebo Tonja | Iyanuoluwa Shode | Akari Asai | Tunde Oluwaseyi Ajayi | Clemencia Siro | Steven Arthur | Mofetoluwa Adeyemi | Orevaoghene Ahia | Anuoluwapo Aremu | Oyinkansola Awosan | Chiamaka Chukwuneke | Bernard Opoku | Awokoya Ayodele | Verrah Otiende | Christine Mwase | Boyd Sinkala | Andre Niyongabo Rubungo | Daniel A. Ajisafe | Emeka Felix Onwuegbuzia | Habib Mbow | Emile Niyomutabazi | Eunice Mukonde | Falalu Ibrahim Lawan | Ibrahim Said Ahmad | Jesujoba O. Alabi | Martin Namukombo | Mbonu Chinedu | Mofya Phiri | Neo Putini | Ndumiso Mngoma | Priscilla A. Amouk | Ruqayya Nasir Iro | Sonia Adhiambo
Findings of the Association for Computational Linguistics: EMNLP 2023
African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems (those that retrieve answer content from other languages while serving people in their native language) offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.
Co-authors
- David Ifeoluwa Adelani 2
- Jesujoba Alabi 2
- Anuoluwapo Aremu 2
- Happy Buzaaba 2
- Bonaventure F. P. Dossou 2
- Chris Chinenye Emezue 2
- Rooweither Mabuya 2
- Shamsuddeen Hassan Muhammad 2
- Muhammad Abdullahi 1
- Tolulope Adelani 1
- Mofetoluwa Adeyemi 1
- Sonia Adhiambo 1
- Apelete Agbolo 1
- Orevaoghene Ahia 1
- Ibrahim Said Ahmad 1
- Tunde Oluwaseyi Ajayi 1
- Daniel A. Ajisafe 1
- Idris Akinade 1
- Priscilla A. Amouk 1
- Steven Arthur 1
- Akari Asai 1
- Gratien Atindogbe 1
- Oyinkansola Awosan 1
- Awokoya Ayodele 1
- Andiswa Bukula 1
- Ester Chimhenga 1
- Mbonu Chinedu 1
- Chiamaka Chukwuneke 1
- Jonathan H. Clark 1
- Cheikh M. Bamba Dione 1
- Abdou Aziz Diop 1
- Mboning Tchiaze Elvis 1
- Ignatius Ezeani 1
- Catherine Gitau 1
- Kudzai Gotosa 1
- Tajuddeen Gwadabe 1
- Tajuddeen R. Gwadabe 1
- Gilles Hacheme 1
- Ruqayya Nasir Iro 1
- Albert Njoroge Kahira 1
- Godson Kalipe 1
- Dietrich Klakow 1
- Falalu Ibrahim Lawan 1
- Tebogo Macucwa 1
- Vukosi Marivate 1
- Derguene Mbaye 1
- Habib Mbow 1
- Victoire Memdjokam Koagne 1
- Patrick Mizha 1
- Ndumiso Mngoma 1
- Jonathan Mukiibi 1
- Eunice Mukonde 1
- Edwin Munkoh-Buabeng 1
- Théogène Musabeyezu 1
- Christine Mwase 1
- Peter Nabende 1
- Marien Nahimana 1
- Martin Namukombo 1
- Perez Ogayo 1
- Odunayo Ogundepo 1
- Akintunde Oladipo 1
- Emeka Felix Onwuegbuzia 1
- Ikechukwu Onyenwe 1
- Bernard Opoku 1
- Salomey Osei 1
- Verrah Otiende 1
- Fatoumata Ouoba Kabore 1
- Abraham Toluwase Owodunni 1
- Mofya Phiri 1
- Neo Putini 1
- Clara E. Rivera 1
- Andre Niyongabo Rubungo 1
- Sebastian Ruder 1
- Olanrewaju Samuel 1
- Iyanuoluwa Shode 1
- Blessing Kudzaishe Sibanda 1
- Claytone Sikasote 1
- Thapelo Sindane 1
- Boyd Sinkala 1
- Clemencia Siro 1
- Allahsera Auguste Tapo 1
- Amelia Taylor 1
- Atnafu Lambebo Tonja 1
- Seydou Traore 1
- Chinedu Uchechukwu 1
- Aliyu Yusuf 1