Shanthi Murugan

2024

Integration of Self-Attention Model with Intralingual Word Embedding for Contextual Semantic Analysis of Thirukkural Text
Shanthi Murugan | Kaviyarasu S | Balasundaram S R
Proceedings of the 21st International Conference on Natural Language Processing (ICON)

Thirukkural, one of the ancient works of Tamil literature, is popular worldwide for the moral values and practices it teaches society. Understanding the meaning of the verses, especially in context, is important. In this regard, this paper introduces a system designed to generate contextualized word meanings for the couplets of the Thirukkural, tailored to help school children understand the text more effectively. Unlike traditional methods that provide detailed explanations in paragraph form, our method focuses on word-by-word interpretation based on context, using an integrated self-attention model. By combining the self-attention mechanism with FastText embeddings, our approach achieves improved performance over state-of-the-art models such as Word2Vec and standalone FastText. We evaluate the semantic understanding of the Thirukkural text using manual scoring as the metric. The Tamil Thirukkural Agarathi serves as the gold-standard dataset for evaluation, and the results demonstrate the effectiveness of our approach in capturing the nuanced semantics of the Thirukkural.

Challenges and Insights in Identifying Hate Speech and Fake News on Social Media
Shanthi Murugan | Arthi R | Boomika E | Jeyanth S | Kaviyarasu S
Proceedings of the 21st International Conference on Natural Language Processing (ICON): Shared Task on Decoding Fake Narratives in Spreading Hateful Stories (Faux-Hate)

Social media has transformed communication, but it has also brought about a number of serious problems, most notably the proliferation of hate speech and false information. Hate-related conversations are frequently fueled by misleading narratives. We address this issue by building a multiclass classification model trained on the Faux Hate Multi-Label Dataset (Biradar et al. 2024), which consists of hateful remarks that are fraudulent and code-mixed in Hindi and English. The model classifies Severity (Low, Medium, High) and Target (Individual, Organization, Religion) on the dataset. Performance was evaluated on the test dataset separately for each task: the model achieves 74% for Severity and 74% for Target. The limitations and performance issues of the model are analyzed and explained.

