Verrah Otiende
2024
Mitigating Translationese in Low-resource Languages: The Storyboard Approach
Garry Kuwanto | Eno-Abasi E. Urua | Priscilla Amondi Amuok | Shamsuddeen Hassan Muhammad | Anuoluwapo Aremu | Verrah Otiende | Loice Emma Nanyanga | Teresiah W. Nyoike | Aniefon D. Akpan | Nsima Ab Udouboh | Idongesit Udeme Archibong | Idara Effiong Moses | Ifeoluwatayo A. Ige | Benjamin Ajibade | Olumide Benjamin Awokoya | Idris Abdulmumin | Saminu Mohammad Aliyu | Ruqayya Nasir Iro | Ibrahim Said Ahmad | Deontae Smith | Praise-EL Michaels | David Ifeoluwa Adelani | Derry Tanti Wijaya | Anietie Andy
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach for data collection by leveraging storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method yields lower accuracy but better fluency in the target language.
2023
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages
Odunayo Ogundepo | Tajuddeen R. Gwadabe | Clara E. Rivera | Jonathan H. Clark | Sebastian Ruder | David Ifeoluwa Adelani | Bonaventure F. P. Dossou | Abdou Aziz Diop | Claytone Sikasote | Gilles Hacheme | Happy Buzaaba | Ignatius Ezeani | Rooweither Mabuya | Salomey Osei | Chris Emezue | Albert Njoroge Kahira | Shamsuddeen Hassan Muhammad | Akintunde Oladipo | Abraham Toluwase Owodunni | Atnafu Lambebo Tonja | Iyanuoluwa Shode | Akari Asai | Tunde Oluwaseyi Ajayi | Clemencia Siro | Steven Arthur | Mofetoluwa Adeyemi | Orevaoghene Ahia | Anuoluwapo Aremu | Oyinkansola Awosan | Chiamaka Chukwuneke | Bernard Opoku | Awokoya Ayodele | Verrah Otiende | Christine Mwase | Boyd Sinkala | Andre Niyongabo Rubungo | Daniel A. Ajisafe | Emeka Felix Onwuegbuzia | Habib Mbow | Emile Niyomutabazi | Eunice Mukonde | Falalu Ibrahim Lawan | Ibrahim Said Ahmad | Jesujoba O. Alabi | Martin Namukombo | Mbonu Chinedu | Mofya Phiri | Neo Putini | Ndumiso Mngoma | Priscilla A. Amouk | Ruqayya Nasir Iro | Sonia Adhiambo
Findings of the Association for Computational Linguistics: EMNLP 2023
African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems, which retrieve answer content from other languages while serving people in their native language, offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.
2021
MasakhaNER: Named Entity Recognition for African Languages
David Ifeoluwa Adelani | Jade Abbott | Graham Neubig | Daniel D’souza | Julia Kreutzer | Constantine Lignos | Chester Palen-Michel | Happy Buzaaba | Shruti Rijhwani | Sebastian Ruder | Stephen Mayhew | Israel Abebe Azime | Shamsuddeen H. Muhammad | Chris Chinenye Emezue | Joyce Nakatumba-Nabende | Perez Ogayo | Aremu Anuoluwapo | Catherine Gitau | Derguene Mbaye | Jesujoba Alabi | Seid Muhie Yimam | Tajuddeen Rabiu Gwadabe | Ignatius Ezeani | Rubungo Andre Niyongabo | Jonathan Mukiibi | Verrah Otiende | Iroro Orife | Davis David | Samba Ngom | Tosin Adewumi | Paul Rayson | Mofetoluwa Adeyemi | Gerald Muriuki | Emmanuel Anebi | Chiamaka Chukwuneke | Nkiruka Odu | Eric Peter Wairagala | Samuel Oyerinde | Clemencia Siro | Tobius Saul Bateesa | Temilola Oloyede | Yvonne Wambui | Victor Akinode | Deborah Nabagereka | Maurice Katusiime | Ayodele Awokoya | Mouhamadane MBOUP | Dibora Gebreyohannes | Henok Tilaye | Kelechi Nwaike | Degaga Wolde | Abdoulaye Faye | Blessing Sibanda | Orevaoghene Ahia | Bonaventure F. P. Dossou | Kelechi Ogueji | Thierno Ibrahima DIOP | Abdoulaye Diallo | Adewale Akinfaderin | Tendai Marengereke | Salomey Osei
Transactions of the Association for Computational Linguistics, Volume 9
We take a step towards addressing the under-representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.
Co-authors
- David Ifeoluwa Adelani 3
- Shamsuddeen Hassan Muhammad 3
- Mofetoluwa Adeyemi 2
- Orevaoghene Ahia 2
- Ibrahim Said Ahmad 2
- Jesujoba Alabi 2
- Anuoluwapo Aremu 2
- Happy Buzaaba 2
- Chiamaka Chukwuneke 2
- Bonaventure F. P. Dossou 2
- Chris Chinenye Emezue 2
- Ignatius Ezeani 2
- Ruqayya Nasir Iro 2
- Salomey Osei 2
- Sebastian Ruder 2
- Clemencia Siro 2
- Jade Abbott 1
- Idris Abdulmumin 1
- Tosin Adewumi 1
- Sonia Adhiambo 1
- Tunde Oluwaseyi Ajayi 1
- Benjamin Ajibade 1
- Daniel A. Ajisafe 1
- Adewale Akinfaderin 1
- Victor Akinode 1
- Aniefon D. Akpan 1
- Saminu Mohammad Aliyu 1
- Priscilla A. Amouk 1
- Priscilla Amondi Amuok 1
- Anietie Andy 1
- Emmanuel Anebi 1
- Aremu Anuoluwapo 1
- Idongesit Udeme Archibong 1
- Steven Arthur 1
- Akari Asai 1
- Ayodele Awokoya 1
- Olumide Benjamin Awokoya 1
- Oyinkansola Awosan 1
- Awokoya Ayodele 1
- Israel Abebe Azime 1
- Tobius Saul Bateesa 1
- Mbonu Chinedu 1
- Jonathan H. Clark 1
- Thierno Ibrahima DIOP 1
- Davis David 1
- Abdoulaye Diallo 1
- Abdou Aziz Diop 1
- Daniel D’souza 1
- Abdoulaye Faye 1
- Dibora Gebreyohannes 1
- Catherine Gitau 1
- Tajuddeen Rabiu Gwadabe 1
- Tajuddeen R. Gwadabe 1
- Gilles Hacheme 1
- Ifeoluwatayo A. Ige 1
- Albert Njoroge Kahira 1
- Maurice Katusiime 1
- Julia Kreutzer 1
- Garry Kuwanto 1
- Falalu Ibrahim Lawan 1
- Constantine Lignos 1
- Mouhamadane MBOUP 1
- Rooweither Mabuya 1
- Tendai Marengereke 1
- Stephen Mayhew 1
- Derguene Mbaye 1
- Habib Mbow 1
- Praise-EL Michaels 1
- Ndumiso Mngoma 1
- Idara Effiong Moses 1
- Jonathan Mukiibi 1
- Eunice Mukonde 1
- Gerald Muriuki 1
- Christine Mwase 1
- Deborah Nabagereka 1
- Joyce Nakatumba-Nabende 1
- Martin Namukombo 1
- Loice Emma Nanyanga 1
- Graham Neubig 1
- Samba Ngom 1
- Emile Niyomutabazi 1
- Rubungo Andre Niyongabo 1
- Kelechi Nwaike 1
- Teresiah W. Nyoike 1
- Nkiruka Odu 1
- Perez Ogayo 1
- Kelechi Ogueji 1
- Odunayo Ogundepo 1
- Akintunde Oladipo 1
- Temilola Oloyede 1
- Emeka Felix Onwuegbuzia 1
- Bernard Opoku 1
- Iroro Orife 1
- Abraham Toluwase Owodunni 1
- Samuel Oyerinde 1
- Chester Palen-Michel 1
- Mofya Phiri 1
- Neo Putini 1
- Paul Rayson 1
- Shruti Rijhwani 1
- Clara E. Rivera 1
- Andre Niyongabo Rubungo 1
- Iyanuoluwa Shode 1
- Blessing Kudzaishe Sibanda 1
- Claytone Sikasote 1
- Boyd Sinkala 1
- Deontae Smith 1
- Henok Tilaye 1
- Atnafu Lambebo Tonja 1
- Nsima Ab Udouboh 1
- Eno-Abasi E. Urua 1
- Eric Peter Wairagala 1
- Yvonne Wambui 1
- Derry Tanti Wijaya 1
- Degaga Wolde 1
- Seid Muhie Yimam 1