bitsa_nlp@LT-EDI-ACL2022: Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments

Vitthal Bhandari, Poonam Goyal


Abstract
Online social networks are ubiquitous and user-friendly. Nevertheless, it is vital to detect and moderate offensive content to maintain decency and empathy. However, mining social media texts is a complex task since users don’t adhere to any fixed patterns. Comments can be written in any combination of languages, many of which may be low-resource. In this paper, we present our system for the LT-EDI shared task on detecting homophobia and transphobia in social media comments. We experiment with a number of monolingual and multilingual transformer-based models, such as mBERT, along with a data augmentation technique for tackling class imbalance. Such large pretrained models have recently shown tremendous success on a variety of benchmark tasks in natural language processing. We observe their performance on a carefully annotated, real-life dataset of YouTube comments in English as well as Tamil. Our submission achieved ranks 9, 6 and 3 with macro-averaged F1-scores of 0.42, 0.64 and 0.58 in the English, Tamil and Tamil-English subtasks respectively. The code for the system has been open-sourced.
Anthology ID:
2022.ltedi-1.18
Volume:
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Bharathi Raja Chakravarthi, B Bharathi, John P McCrae, Manel Zarrouk, Kalika Bali, Paul Buitelaar
Venue:
LTEDI
Publisher:
Association for Computational Linguistics
Pages:
149–154
URL:
https://aclanthology.org/2022.ltedi-1.18
DOI:
10.18653/v1/2022.ltedi-1.18
Cite (ACL):
Vitthal Bhandari and Poonam Goyal. 2022. bitsa_nlp@LT-EDI-ACL2022: Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pages 149–154, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
bitsa_nlp@LT-EDI-ACL2022: Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments (Bhandari & Goyal, LTEDI 2022)
PDF:
https://aclanthology.org/2022.ltedi-1.18.pdf
Video:
https://aclanthology.org/2022.ltedi-1.18.mp4
Code:
vitthal-bhandari/homophobia-transphobia-detection