Cross-lingual Transfer and Multilingual Learning for Detecting Harmful Behaviour in African Under-Resourced Language Dialogue

Tunde Oluwaseyi Ajayi, Mihael Arcan, Paul Buitelaar


Abstract
Most harmful dialogue detection models are developed for high-resourced languages. Consequently, users who speak under-resourced languages cannot fully benefit from these models in terms of usage, development, detection and mitigation of harmful dialogue utterances. Our work aims at detecting harmful utterances in under-resourced African languages. We leverage transfer learning using pretrained models trained with multilingual embeddings to develop a cross-lingual model capable of detecting harmful content across various African languages. We first fine-tune a harmful dialogue detection model on a selected African dialogue dataset. Additionally, we fine-tune a model on a combined dataset in some African languages to develop a multilingual harmful dialogue detection model. We then evaluate the cross-lingual model’s ability to generalise to an unseen African language by performing harmful dialogue detection in an under-resourced language not present during pretraining or fine-tuning. We evaluate our models on the test datasets. We show that our best performing models achieve impressive results in terms of F1 score. Finally, we discuss the results and limitations of our work.
Anthology ID:
2024.sigdial-1.49
Volume:
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
September
Year:
2024
Address:
Kyoto, Japan
Editors:
Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri, David Howcroft, Kazunori Komatani
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Note:
Pages:
579–589
Language:
URL:
https://aclanthology.org/2024.sigdial-1.49
DOI:
10.18653/v1/2024.sigdial-1.49
Bibkey:
Cite (ACL):
Tunde Oluwaseyi Ajayi, Mihael Arcan, and Paul Buitelaar. 2024. Cross-lingual Transfer and Multilingual Learning for Detecting Harmful Behaviour in African Under-Resourced Language Dialogue. In Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 579–589, Kyoto, Japan. Association for Computational Linguistics.
Cite (Informal):
Cross-lingual Transfer and Multilingual Learning for Detecting Harmful Behaviour in African Under-Resourced Language Dialogue (Ajayi et al., SIGDIAL 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.sigdial-1.49.pdf