MUCS@LT-EDI-EACL2021:CoHope-Hope Speech Detection for Equality, Diversity, and Inclusion in Code-Mixed Texts

Fazlourrahman Balouchzahi, Aparna B K, H L Shashirekha


Abstract
This paper describes the models submitted by team MUCS to the “Hope Speech Detection for Equality, Diversity, and Inclusion-EACL 2021” shared task, which aims to classify a comment or post in English and in code-mixed texts of two language pairs, Tamil-English (Ta-En) and Malayalam-English (Ma-En), into one of three predefined categories: “Hope_speech”, “Non_hope_speech”, and “other_languages”. Three models are proposed for the shared task: CoHope-ML, CoHope-NN, and CoHope-TL, based on an ensemble of classifiers, a Keras Neural Network (NN), and a BiLSTM with Conv1d model, respectively. The CoHope-ML and CoHope-NN models are trained on a feature set comprising char sequences extracted from sentences combined with words for the Ma-En and Ta-En code-mixed texts, and on a combination of word and char n-grams along with syntactic word n-grams for the English text. The CoHope-TL model consists of three major parts: training a tokenizer, training a BERT Language Model (LM), and then using the pre-trained BERT LM as weights in the BiLSTM-Conv1d model. Of the three proposed models, CoHope-ML performed best, obtaining 1st, 2nd, and 3rd ranks with weighted F1-scores of 0.85, 0.92, and 0.59 for the Ma-En, English, and Ta-En texts, respectively.
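To make the feature set and the evaluation metric in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows one plausible way to combine character n-grams with words into a bag of features, and how a weighted F1-score is computed from per-class F1 values. The function names, the n-gram range (2 to 4), the `c:`/`w:` feature prefixes, and the toy labels are all assumptions made for illustration; the syntactic word n-grams and the classifiers themselves are omitted.

```python
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Return all character n-grams of length n_min..n_max (range is an assumption)."""
    text = text.lower()
    grams = []
    for n in range(n_min, n_max + 1):
        grams.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return grams


def extract_features(sentence):
    """Hypothetical feature extractor: char n-gram counts plus word counts."""
    # Prefixes keep character features and word features in separate namespaces.
    features = Counter("c:" + g for g in char_ngrams(sentence))
    features.update("w:" + w for w in sentence.lower().split())
    return features


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * count
    return total / len(y_true)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the shared task.
    gold = ["Hope_speech", "Non_hope_speech", "Non_hope_speech", "other_languages"]
    pred = ["Hope_speech", "Non_hope_speech", "Hope_speech", "other_languages"]
    print(extract_features("ab ab")["c:ab"])
    print(weighted_f1(gold, pred))
```

Because each class is weighted by its support, a frequent class such as “Non_hope_speech” influences the weighted F1-score more than a rare one.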
Anthology ID:
2021.ltedi-1.27
Volume:
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
April
Year:
2021
Address:
Kyiv
Editors:
Bharathi Raja Chakravarthi, John P. McCrae, Manel Zarrouk, Kalika Bali, Paul Buitelaar
Venue:
LTEDI
Publisher:
Association for Computational Linguistics
Pages:
180–187
URL:
https://aclanthology.org/2021.ltedi-1.27
Cite (ACL):
Fazlourrahman Balouchzahi, Aparna B K, and H L Shashirekha. 2021. MUCS@LT-EDI-EACL2021:CoHope-Hope Speech Detection for Equality, Diversity, and Inclusion in Code-Mixed Texts. In Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion, pages 180–187, Kyiv. Association for Computational Linguistics.
Cite (Informal):
MUCS@LT-EDI-EACL2021:CoHope-Hope Speech Detection for Equality, Diversity, and Inclusion in Code-Mixed Texts (Balouchzahi et al., LTEDI 2021)
PDF:
https://aclanthology.org/2021.ltedi-1.27.pdf
Software:
2021.ltedi-1.27.Software.zip
Data:
Dakshina