Olájídé Ishola
2020
Yorùbá Dependency Treebank (YTB)
Olájídé Ishola, Daniel Zeman
Proceedings of the Twelfth Language Resources and Evaluation Conference
Low-resource languages present enormous opportunities for NLP as well as varying degrees of difficulty. The newly released treebank of hand-annotated parts of the Yoruba Bible provides an avenue for dependency analysis of the Yoruba language, that is, the application of a new grammar formalism to the language. In this paper, we discuss our choice of Universal Dependencies, the important dependency annotation decisions considered in creating the first annotation guidelines for Yoruba, and the results of our parsing experiments. We also lay the foundation for the future incorporation of other domains with an initial test on Yoruba Wikipedia articles, and we highlight future directions for the rapid expansion of the treebank.