TEAM HUB@LT-EDI-EACL2021: Hope Speech Detection Based On Pre-trained Language Model

Bo Huang, Yang Bai


Abstract
This paper describes the system submitted by the TEAM_HUB team to LT-EDI 2021: Hope Speech Detection, the first shared task on hope speech detection. The task is a text classification problem, and its data set covers three languages (English, Tamil, and Malayalam). Based on our analysis of the task description and the data set, we designed a system built on a pre-trained language model. The system combines the XLM-RoBERTa pre-trained language model with the TF-IDF algorithm. In the final ranking announced by the task organizers, our system obtained F1 scores of 0.93, 0.84, and 0.59 on the English, Malayalam, and Tamil datasets, ranking 1st, 2nd, and 3rd, respectively.
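The abstract does not say how the XLM-RoBERTa representation and the TF-IDF features are combined. As an illustration only, the following minimal sketch assumes simple concatenation of a sentence embedding with a TF-IDF vector. The function names `tfidf_vectors` and `combine_features`, the smoothed IDF formula, the example sentences, and the placeholder embedding values are all assumptions for this sketch and are not taken from the paper or its released software.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumed variant) so every term has a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def combine_features(sentence_embedding, tfidf_vector):
    """Concatenate a dense sentence embedding with a TF-IDF vector (assumed scheme)."""
    return list(sentence_embedding) + list(tfidf_vector)


# Hypothetical example sentences, not drawn from the shared-task data.
docs = ["you can do it", "this is hopeless", "you are not alone"]
vocab, vectors = tfidf_vectors(docs)

# Placeholder standing in for an XLM-RoBERTa sentence embedding (made-up values).
fake_embedding = [0.1, -0.2, 0.3]
features = combine_features(fake_embedding, vectors[0])
```

In a real pipeline the placeholder would be replaced by the encoder's pooled output, and the combined vector would feed a classification layer. Whether the authors concatenate, ensemble, or combine the two signals in some other way is not stated in the abstract.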
Anthology ID:
2021.ltedi-1.17
Volume:
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
April
Year:
2021
Address:
Kyiv
Venues:
EACL | LTEDI
Publisher:
Association for Computational Linguistics
Pages:
122–127
URL:
https://aclanthology.org/2021.ltedi-1.17
PDF:
https://aclanthology.org/2021.ltedi-1.17.pdf
Software:
 2021.ltedi-1.17.Software.zip