Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin

Marian Marchal, Merel Scholman, Vera Demberg


Abstract
Cross-linguistic research on discourse structure and coherence marking requires discourse-annotated corpora and connective lexicons in a large number of languages. However, the availability of such resources is limited, especially for languages for which linguistic resources are scarce in general, such as Nigerian Pidgin. In this study, we demonstrate how a semi-automatic approach can be used to source connectives and their relation senses and develop a discourse-annotated corpus in a low-resource language. Connectives and their relation senses were extracted from a parallel corpus by combining automatic annotations (from a PDTB end-to-end parser) with manual annotations. This resulted in Naija-Lex, a lexicon of discourse connectives in Nigerian Pidgin with English translations. The lexicon shows that the majority of Nigerian Pidgin connectives are borrowed from its English lexifier, but that there are also some connectives that are unique to Nigerian Pidgin.
Anthology ID:
2021.codi-main.8
Volume:
Proceedings of the 2nd Workshop on Computational Approaches to Discourse
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic and Online
Editors:
Chloé Braud, Christian Hardmeier, Junyi Jessy Li, Annie Louis, Michael Strube, Amir Zeldes
Venue:
CODI
Publisher:
Association for Computational Linguistics
Pages:
84–94
URL:
https://aclanthology.org/2021.codi-main.8
DOI:
10.18653/v1/2021.codi-main.8
Cite (ACL):
Marian Marchal, Merel Scholman, and Vera Demberg. 2021. Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin. In Proceedings of the 2nd Workshop on Computational Approaches to Discourse, pages 84–94, Punta Cana, Dominican Republic and Online. Association for Computational Linguistics.
Cite (Informal):
Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin (Marchal et al., CODI 2021)
PDF:
https://aclanthology.org/2021.codi-main.8.pdf
Video:
https://aclanthology.org/2021.codi-main.8.mp4