2025
Findings of the Shared Task on Abusive Tamil and Malayalam Text Targeting Women on Social Media: DravidianLangTech@NAACL 2025
Saranya Rajiakodi | Bharathi Raja Chakravarthi | Shunmuga Priya Muthusamy Chinnan | Ruba Priyadharshini | Raja Meenakshi J | Kathiravan Pannerselvam | Rahul Ponnusamy | Bhuvaneswari Sivagnanam | Paul Buitelaar | Bhavanimeena K | Jananayagan Jananayagan | Kishore Kumar Ponnusamy
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
This overview paper presents the findings of the Shared Task on Abusive Tamil and Malayalam Text Targeting Women on Social Media, organized as part of DravidianLangTech@NAACL 2025. The task aimed to encourage the development of robust systems to detect abusive content targeting women in Tamil and Malayalam, two low-resource Dravidian languages. Participants were provided with annotated datasets containing abusive and non-abusive text curated from YouTube comments. We present an overview of the approaches and analyse the results of the shared task submissions. We believe the findings presented in this paper will be useful to researchers working in Dravidian language technology.
Findings of the Shared Task on Misogyny Meme Detection: DravidianLangTech@NAACL 2025
Bharathi Raja Chakravarthi | Rahul Ponnusamy | Saranya Rajiakodi | Shunmuga Priya Muthusamy Chinnan | Paul Buitelaar | Bhuvaneswari Sivagnanam | Anshid K A
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
The rapid expansion of social media has facilitated communication but also enabled the spread of misogynistic memes, reinforcing gender stereotypes and toxic online environments. Detecting such content is challenging due to the multimodal nature of memes, where meaning emerges from the interplay of text and images. The Misogyny Meme Detection shared task at DravidianLangTech@NAACL 2025 focused on Tamil and Malayalam, encouraging the development of multimodal approaches. With 114 teams registered and 23 submitting predictions, participants leveraged various pretrained language models and vision models through fusion techniques. The best models achieved high macro F1 scores (0.83682 for Tamil, 0.87631 for Malayalam), highlighting the effectiveness of multimodal learning. Despite these advances, challenges such as dataset bias, class imbalance, and cultural variation persist. Future research should refine multimodal detection methods to improve accuracy and adaptability, fostering safer and more inclusive online spaces.
Findings of the Shared Task on Multilingual Bias and Propaganda Annotation in Political Discourse
Shunmuga Priya Muthusamy Chinnan | Bharathi Raja Chakravarthi | Meghann Drury-Grogan | Senthil Kumar B | Saranya Rajiakodi | Angel Deborah S
Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion
The Multilingual Bias and Propaganda Annotation task focuses on annotating biased and propagandist content in political discourse in English and Tamil. This paper presents the findings of the shared task on bias and propaganda annotation. The task comprises two subtasks, one in English and one in Tamil, both annotation tasks in which a text comment is to be labeled. With a particular emphasis on polarizing policy debates such as the US Gender Policy and India’s Three Language Policy, this shared task invited participants to build annotation systems capable of labeling textual bias and propaganda. The dataset was curated by collecting comments from YouTube videos. Our curated dataset consists of 13,010 English sentences on the US Gender Policy and the Russia-Ukraine War, and 5,880 Tamil sentences on the Three Language Policy. Participants were instructed to annotate at the sentence level, following the guidelines, using fine-grained, domain-specific bias labels and four propaganda labels. Participants were encouraged to leverage existing tools or develop novel approaches to perform fine-grained annotations that capture the complex socio-political nuances present in the data.
Findings of the Shared Task on Caste and Migration Hate Speech Detection
Saranya Rajiakodi | Bharathi Raja Chakravarthi | Rahul Ponnusamy | Shunmuga Priya Muthusamy Chinnan | Prasanna Kumar Kumaresan | Sathiyaraj Thangasamy | Bhuvaneswari Sivagnanam | Balasubramanian Palani | Kogilavani Shanmugavadivel | Abirami Murugappan | Charmathi Rajkumar
Proceedings of the 5th Conference on Language, Data and Knowledge: Fifth Workshop on Language Technology for Equality, Diversity, Inclusion
Hate speech targeting caste and migration communities is a growing concern on online platforms, particularly in linguistically diverse regions. By focusing on Tamil-language text content, this task provides a unique opportunity to tackle caste- or migration-related hate speech detection in Tamil, a low-resource language, contributing to a safer digital space. We present the results and main findings of the shared task on caste and migration hate speech detection. The task is a binary classification determining whether a text is caste/migration-related hate speech or not. The task attracted 17 participating teams, experimenting with a wide range of methodologies from traditional machine learning to advanced multilingual transformers. The top-performing system achieved a macro F1-score of 0.88105, employing an ensemble of fine-tuned transformer models including XLM-R and MuRIL. Our analysis highlights the effectiveness of multilingual transformers in low-resource settings, ensemble learning, and culturally informed techniques grounded in socio-political context.