CLTL@HarmPot-ID: Leveraging Transformer Models for Detecting Offline Harm Potential and Its Targets in Low-Resource Languages

Yeshan Wang, Ilia Markov


Abstract
We present the winning approach to the TRAC 2024 Shared Task on Offline Harm Potential Identification (HarmPot-ID). The task focused on low-resource Indian languages and consisted of two sub-tasks: 1a) predicting the offline harm potential and 1b) detecting the most likely target(s) of the offline harm. We explored low-source domain specific, cross-lingual, and monolingual transformer models and submitted the aggregate predictions from the MuRIL and BERT models. Our approach achieved 0.74 micro-averaged F1-score for sub-task 1a and 0.96 for sub-task 1b, securing the 1st rank for both sub-tasks in the competition.
Anthology ID:
2024.trac-1.3
Volume:
Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, Bharathi Raja Chakravarthi, Bornini Lahiri, Siddharth Singh, Shyam Ratan
Venues:
TRAC | WS
SIG:
Publisher:
ELRA and ICCL
Note:
Pages:
21–26
Language:
URL:
https://aclanthology.org/2024.trac-1.3
DOI:
Bibkey:
Cite (ACL):
Yeshan Wang and Ilia Markov. 2024. CLTL@HarmPot-ID: Leveraging Transformer Models for Detecting Offline Harm Potential and Its Targets in Low-Resource Languages. In Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024, pages 21–26, Torino, Italia. ELRA and ICCL.
Cite (Informal):
CLTL@HarmPot-ID: Leveraging Transformer Models for Detecting Offline Harm Potential and Its Targets in Low-Resource Languages (Wang & Markov, TRAC-WS 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.trac-1.3.pdf