Xiaofei Li

2024

Multilingual Bot Accusations: How Different Linguistic Contexts Shape Perceptions of Social Bots
Leon Fröhling | Xiaofei Li | Dennis Assenmacher
Proceedings of the 4th Workshop on Computational Linguistics for the Political and Social Sciences: Long and short papers

Recent research indicates that the online use of the term ”bot” has evolved over time. In the past, people used the term to accuse others of displaying automated behavior. However, it has gradually transformed into a linguistic tool to dehumanize the conversation partner, particularly on polarizing topics. Although this trend has been observed in English-speaking contexts, it is still unclear whether it holds true in other socio-linguistic environments. In this work we extend existing work on bot accusations and explore the phenomenon in a multilingual setting. We identify three distinct accusation patterns that characterize the different languages.

2023

pdf bib abs

What to Fuse and How to Fuse: Exploring Emotion and Personality Fusion Strategies for Explainable Mental Disorder Detection
Sourabh Zanwar | Xiaofei Li | Daniel Wiechmann | Yu Qiao | Elma Kerz
Findings of the Association for Computational Linguistics: ACL 2023

Mental health disorders (MHD) are increasingly prevalent worldwide and constitute one of the greatest challenges facing our healthcare systems and modern societies in general. In response to this societal challenge, there has been a surge in digital mental health research geared towards the development of new techniques for unobtrusive and efficient automatic detection of MHD. Within this area of research, natural language processing techniques are playing an increasingly important role, showing promising detection results from a variety of textual data. Recently, there has been a growing interest in improving mental illness detection from textual data by way of leveraging emotions: ‘Emotion fusion’ refers to the process of integrating emotion information with general textual information to obtain enhanced information for decision-making. However, while the available research has shown that MHD prediction can be improved through a variety of different fusion strategies, previous works have been confined to a particular fusion strategy applied to a specific dataset, and so is limited by the lack of meaningful comparability. In this work, we integrate and extend this research by conducting extensive experiments with three types of deep learning-based fusion strategies: (i) feature-level fusion, where a pre-trained masked language model for mental health detection (MentalRoBERTa) was infused with a comprehensive set of engineered features, (ii) model fusion, where the MentalRoBERTa model was infused with hidden representations of other language models and (iii) task fusion, where a multi-task framework was leveraged to learn the features for auxiliary tasks. In addition to exploring the role of different fusion strategies, we expand on previous work by broadening the information infusion to include a second domain related to mental health, namely personality. We evaluate algorithm performance on data from two benchmark datasets, encompassing five mental health conditions: attention deficit hyperactivity disorder, anxiety, bipolar disorder, depression and psychological stress.

2022

pdf bib abs

(Psycho-)Linguistic Features Meet Transformer Models for Improved Explainable and Controllable Text Simplification
Yu Qiao | Xiaofei Li | Daniel Wiechmann | Elma Kerz
Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)

State-of-the-art text simplification (TS) systems adopt end-to-end neural network models to directly generate the simplified version of the input text, and usually function as a blackbox. Moreover, TS is usually treated as an all-purpose generic task under the assumption of homogeneity, where the same simplification is suitable for all. In recent years, however, there has been increasing recognition of the need to adapt the simplification techniques to the specific needs of different target groups. In this work, we aim to advance current research on explainable and controllable TS in two ways: First, building on recently proposed work to increase the transparency of TS systems (Garbacea et al., 2020), we use a large set of (psycho-)linguistic features in combination with pre-trained language models to improve explainable complexity prediction. Second, based on the results of this preliminary task, we extend a state-of-the-art Seq2Seq TS model, ACCESS (Martin et al., 2020), to enable explicit control of ten attributes. The results of experiments show (1) that our approach improves the performance of state-of-the-art models for predicting explainable complexity and (2) that explicitly conditioning the Seq2Seq model on ten attributes leads to a significant improvement in performance in both within-domain and out-of-domain settings.

pdf bib abs

MANTIS at TSAR-2022 Shared Task: Improved Unsupervised Lexical Simplification with Pretrained Encoders
Xiaofei Li | Daniel Wiechmann | Yu Qiao | Elma Kerz
Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)

In this paper we present our contribution to the TSAR-2022 Shared Task on Lexical Simplification of the EMNLP 2022 Workshop on Text Simplification, Accessibility, and Readability. Our approach builds on and extends the unsupervised lexical simplification system with pretrained encoders (LSBert) system introduced in Qiang et al. (2020) in the following ways: For the subtask of simplification candidate selection, it utilizes a RoBERTa transformer language model and expands the size of the generated candidate list. For subsequent substitution ranking, it introduces a new feature weighting scheme and adopts a candidate filtering method based on textual entailment to maximize semantic similarity between the target word and its simplification. Our best-performing system improves LSBert by 5.9% accuracy and achieves second place out of 33 ranked solutions.

Co-authors

Sourabh Zanwar 1

Venues

Fix author