Language models exhibit inherent security vulnerabilities, which can be traced to several factors, among them the malicious alteration of the input data. Such weaknesses compromise the robustness of language models, a problem that becomes more critical when adversarial attacks are stealthy and do not require high computational resources. In this work, we study how vulnerable pretrained English language models are to adversarial attacks based on subtle modifications of their input. We claim that an attack may be more effective if it targets the words that are most salient for the model's discriminative task. Accordingly, we propose a new attack built upon a two-step approach: first, we use a posteriori explainability methods to identify the words most influential for the classification task, and second, we replace them with contextual synonyms retrieved by a small language model. Since the attack has to be as stealthy as possible, we also propose a new evaluation measure that combines the effectiveness of the attack with the number of modifications performed. The results show that pretrained English language models are vulnerable to minimal semantic changes, which makes the design of countermeasures imperative.
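The two-step attack described above can be sketched as follows. This is a minimal, self-contained illustration only: a toy keyword classifier stands in for the pretrained model, leave-one-out word importance stands in for the a posteriori explainability method, a hand-built synonym table stands in for the small language model's contextual synonyms, and the final score is a hypothetical effectiveness-per-modification measure, not the measure proposed in the paper.

```python
def classify(words):
    """Toy classifier: score = fraction of 'positive' cue words."""
    cues = {"great", "excellent", "love", "wonderful"}
    return sum(w in cues for w in words) / max(len(words), 1)

def word_saliency(words):
    """Step 1: leave-one-out importance of each word for the prediction
    (a stand-in for a posteriori explainability methods)."""
    base = classify(words)
    return [(i, base - classify(words[:i] + words[i + 1:]))
            for i in range(len(words))]

# Stand-in for contextual synonyms from a small language model.
SYNONYMS = {"great": "decent", "love": "like", "excellent": "fine"}

def attack(words, budget=2):
    """Step 2: replace the most salient words with synonyms, stopping
    once the budget of modifications is spent (keeps the attack stealthy)."""
    ranked = sorted(word_saliency(words), key=lambda t: -t[1])
    adv, edits = list(words), 0
    for i, sal in ranked:
        if edits >= budget or sal <= 0:
            break
        if adv[i] in SYNONYMS:
            adv[i] = SYNONYMS[adv[i]]
            edits += 1
    return adv, edits

def stealth_score(orig, adv, edits):
    """Hypothetical measure: prediction shift per modification performed."""
    shift = abs(classify(orig) - classify(adv))
    return shift / edits if edits else 0.0
```

In a real setting, the saliency step would come from an explainability method applied to the pretrained model, and the synonym step from a masked-language-model's contextual predictions; the budget-limited loop is what keeps the number of modifications, and hence the attack's visibility, low.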
With mental health issues on the rise on the Web, especially among young people, there is a growing need for effective identification and intervention. In this paper, we introduce a new open-source corpus for the early detection of mental disorders in Spanish, focusing on eating disorders, depression, and anxiety. It consists of user messages posted in public groups on the Telegram messaging platform, covering more than 1,300 subjects and over 45,000 messages. The corpus has been manually annotated via crowdsourcing and is ready for use in several Natural Language Processing tasks, including text classification and regression. The samples include both textual and temporal data. To provide a benchmark for future research, we conduct text classification and regression experiments using state-of-the-art transformer-based models.
With the rise of Large Language Models (LLMs), the NLP community is increasingly aware of the environmental consequences of model development due to the energy consumed for training and running these models. This study investigates the energy consumption and environmental impact of the systems participating in the MentalRiskES shared task, held at the Iberian Languages Evaluation Forum (IberLEF) 2023, which focuses on early risk identification of mental disorders in Spanish comments. Participants were asked to submit, for each prediction, a set of efficiency metrics, including carbon dioxide emissions. We conduct an empirical analysis of the submitted data in terms of model architecture, task complexity, and dataset characteristics, covering a spectrum from traditional Machine Learning (ML) models to advanced LLMs. Our findings contribute to understanding the ecological footprint of NLP systems and advocate for prioritizing environmental impact assessment in shared tasks to foster sustainability across diverse model types and approaches, with evaluation campaigns providing an adequate framework for this kind of analysis.
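The core of such an empirical analysis is grouping the per-prediction efficiency reports by model family and comparing their emissions. The sketch below illustrates this with made-up records: the field names and figures are hypothetical placeholders, not the actual shared-task submissions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-prediction efficiency reports (grams of CO2 each).
submissions = [
    {"model": "SVM",       "co2_g": 0.4},
    {"model": "BERT-base", "co2_g": 3.1},
    {"model": "BERT-base", "co2_g": 2.7},
    {"model": "LLM",       "co2_g": 18.5},
]

def mean_emissions(records):
    """Average grams of CO2 per prediction, grouped by model family."""
    by_model = defaultdict(list)
    for r in records:
        by_model[r["model"]].append(r["co2_g"])
    return {m: mean(v) for m, v in by_model.items()}
```

The same grouping extends naturally to the other reported efficiency metrics (e.g. energy or runtime), and to further keys such as task or dataset, which is how architecture, task complexity, and dataset characteristics can be contrasted.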