Sandra Mitrović

Also published as: Sandra Mitrovic

2024

Hospital discharge letters are a fundamental component of patient management, as they provide the crucial information needed for post-hospital care. However, their creation is demanding and resource-intensive, as it requires consulting several reports documenting the patient’s journey throughout the hospital stay. Given the increasing pressures on doctors’ time, tools that can draft a reasonable discharge summary, to be reviewed and finalized by experts, would be welcome. In this paper, we present a comparative study exploring the possibility of automatically generating discharge summaries within the context of a hospital in an Italian-speaking region, and we discuss quantitative and qualitative results. Despite some shortcomings, the results show that a generic generative system such as ChatGPT is capable of producing discharge summaries that are relatively close to human-written ones, even in Italian.

Comparing panic and anxiety on a dataset collected from social media
Sandra Mitrović | Oscar William Lithgow-Serrano | Carlo Schillaci
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)

The recognition of mental health’s crucial significance has led to a growing interest in using social media text data in current research. However, there remains a significant gap in the study of panic and anxiety on these platforms, despite their high prevalence and severe impact. In this paper, we address this gap by presenting a dataset consisting of 1,930 user posts from Quora and Reddit specifically focusing on panic and anxiety. Through a combination of lexical analysis, emotion detection, and writer attitude assessment, we explore the unique characteristics of each condition. To gain deeper insights, we employ a mental health-specific transformer model and a large language model for qualitative analysis. Our findings not only contribute to the understanding of digital discourse on anxiety and panic but also provide valuable resources for the broader research community. We make our dataset, methodologies, and code available to advance understanding and facilitate future studies.

Detecting ChatGPT-Generated Text with GZIP-KNN: A No-Training, Low-Resource Approach
Matthias Berchtold | Sandra Mitrovic | Davide Andreoletti | Daniele Puccinelli | Omran Ayoub
Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024)

BUST: Benchmark for the evaluation of detectors of LLM-Generated Text
Joseph Cornelius | Oscar Lithgow-Serrano | Sandra Mitrovic | Ljiljana Dolamic | Fabio Rinaldi
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

We introduce BUST, a comprehensive benchmark designed to evaluate detectors of texts generated by instruction-tuned large language models (LLMs). Unlike previous benchmarks, our focus lies on evaluating the performance of detector systems, acknowledging the inevitable influence of the underlying tasks and different LLM generators. Our benchmark dataset consists of 25K texts from humans and 7 LLMs responding to instructions across 10 tasks from 3 diverse sources. Using the benchmark, we evaluated 5 detectors and found substantial performance variance across tasks. A meta-analysis of the dataset characteristics was conducted to guide the examination of detector performance. The dataset was analyzed using diverse metrics assessing linguistic features such as fluency and coherence, readability scores, and writer attitudes such as emotions, convincingness, and persuasiveness. Features impacting detector performance were investigated with surrogate models, revealing that emotional content in texts enhanced the performance of some detectors, yet the most effective detector demonstrated consistent performance irrespective of writers’ attitudes and text styles. Our approach focused on investigating relationships between the detectors’ performance and two key factors: text characteristics and LLM generators. We believe BUST will provide valuable insights into selecting detectors tailored to specific text styles and tasks and facilitate a more practical and in-depth investigation of detection systems for LLM-generated text.

NLP in support of Pharmacovigilance
Fabio Rinaldi | Lorenzo Ruinelli | Roberta Noseda | Oscar William Lithgow Serrano | Sandra Mitrovic
Proceedings of the 9th edition of the Swiss Text Analytics Conference

Presenting BUST - A benchmark for the evaluation of system detectors of LLM-Generated Text
Joseph Cornelius | Oscar William Lithgow Serrano | Sandra Mitrović | Ljiljana Dolamic | Fabio Rinaldi
Proceedings of the 9th edition of the Swiss Text Analytics Conference

What can we discover about panic and anxiety from bloggers in Quora and Reddit?
Sandra Mitrović | Oscar William Lithgow Serrano
Proceedings of the 9th edition of the Swiss Text Analytics Conference

2020

SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces
Vani Kanjirangat | Sandra Mitrovic | Alessandro Antonucci | Fabio Rinaldi
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Lexical semantic change detection (also known as semantic shift tracing) is the task of identifying words that have changed their meaning over time. Unsupervised semantic shift tracing, the focal point of SemEval-2020, is particularly challenging. Given the unsupervised setup, in this work we propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings. Disagreements in the obtained clusters then naturally allow us to quantify the level of semantic shift for each target word in four target languages. To leverage this idea, clustering is performed on contextualized (BERT-based) embeddings of word occurrences. The obtained results show that our approach performs well both when measured separately (per language) and overall, where we surpass all provided SemEval baselines.
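
The idea in this abstract (cluster the occurrences of a target word, then measure how differently the two time periods populate those clusters) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the clustering algorithm nor the disagreement measure, so the k-means routine, the Jensen-Shannon divergence, the function names, and the toy 2-D vectors standing in for BERT embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter


def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Tiny deterministic k-means (assumed clustering method, not from the paper)."""
    # Farthest-point initialisation: start at the first point, then repeatedly
    # add the point that is farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sqdist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, so values lie in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def shift_score(emb_old, emb_new, k=2):
    """Cluster all occurrences jointly, then compare per-period cluster usage."""
    labels = kmeans(emb_old + emb_new, k)
    old_counts = Counter(labels[:len(emb_old)])
    new_counts = Counter(labels[len(emb_old):])
    p = [old_counts[c] / len(emb_old) for c in range(k)]
    q = [new_counts[c] / len(emb_new) for c in range(k)]
    return js_divergence(p, q)


# Toy occurrence vectors (hypothetical stand-ins for contextual embeddings).
# "stable" uses both senses equally in both periods; "shifted" switches sense.
stable_old = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
stable_new = [(0.1, 0.1), (0.0, 0.0), (5.0, 5.0), (5.1, 5.1)]
shifted_old = [(0.0, 0.1), (0.1, 0.0), (0.1, 0.1), (0.0, 0.0)]
shifted_new = [(5.0, 5.1), (5.1, 5.0), (5.0, 5.0), (5.1, 5.1)]

stable_score = shift_score(stable_old, stable_new)
shifted_score = shift_score(shifted_old, shifted_new)
print(f"stable: {stable_score:.2f}, shifted: {shifted_score:.2f}")
```

In this toy setup a word whose occurrences fall into the same clusters in both periods scores near 0, while a word whose occurrences move entirely to a different cluster scores near 1. With real data, the vectors would come from a BERT-style encoder and the number of clusters would need to be chosen per word.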