Sudeshna Das


2024

Overview of the 9th Social Media Mining for Health Applications (#SMM4H) Shared Tasks at ACL 2024 – Large Language Models and Generalizability for Social Media NLP
Dongfang Xu | Guillermo Garcia | Lisa Raithel | Philippe Thomas | Roland Roller | Eiji Aramaki | Shoko Wakamiya | Shuntaro Yada | Pierre Zweigenbaum | Karen O’Connor | Sai Samineni | Sophia Hernandez | Yao Ge | Swati Rajwal | Sudeshna Das | Abeed Sarker | Ari Klein | Ana Schmidt | Vishakha Sharma | Raul Rodriguez-Esteban | Juan Banda | Ivan Amaro | Davy Weissenbacher | Graciela Gonzalez-Hernandez
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks

For the past nine years, the Social Media Mining for Health Applications (#SMM4H) shared tasks have promoted community-driven development and evaluation of advanced natural language processing systems to detect, extract, and normalize health-related information in publicly available user-generated content. This year, #SMM4H included seven shared tasks in English, Japanese, German, French, and Spanish from Twitter, Reddit, and health forums. A total of 84 teams from 22 countries registered for #SMM4H, and 45 teams participated in at least one task. This represents a growth of 180% and 160% in registration and participation, respectively, compared to the last iteration. This paper provides an overview of the tasks and participating systems. The data sets remain available upon request, and new systems can be evaluated through the post-evaluation phase on CodaLab.

2022

Resilience of Named Entity Recognition Models under Adversarial Attack
Sudeshna Das | Jiaul Paik
Proceedings of the First Workshop on Dynamic Adversarial Data Collection

Named entity recognition (NER) is a popular language processing task with wide applications. Progress in NER has been noteworthy, as evidenced by the F1 scores obtained on standard datasets. In practice, however, the end-user uses an NER model on their dataset out-of-the-box, on text that may not be pristine. In this paper we present four model-agnostic adversarial attacks to gauge the resilience of NER models in such scenarios. Our experiments on four state-of-the-art NER methods with five English datasets suggest that the NER models are over-reliant on case information and do not utilise contextual information well. As such, they are highly susceptible to adversarial attacks based on these features.