Tatiana Passali


2022

pdf bib
LARD: Large-scale Artificial Disfluency Generation
Tatiana Passali | Thanassis Mavropoulos | Grigorios Tsoumakas | Georgios Meditskos | Stefanos Vrochidis
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Disfluency detection is a critical task in real-time dialogue systems. However, despite its importance, it remains a relatively unexplored field, mainly due to the lack of appropriate datasets. At the same time, existing datasets suffer from various issues, including class imbalance issues, which can significantly affect the performance of the model on rare classes, as it is demonstrated in this paper. To this end, we propose LARD, a method for generating complex and realistic artificial disfluencies with little effort. The proposed method can handle three of the most common types of disfluencies: repetitions, replacements, and restarts. In addition, we release a new large-scale dataset with disfluencies that can be used on four different tasks: disfluency detection, classification, extraction, and correction. Experimental results on the LARD dataset demonstrate that the data produced by the proposed method can be effectively used for detecting and removing disfluencies, while also addressing limitations of existing datasets.

2021

pdf bib
Towards Human-Centered Summarization: A Case Study on Financial News
Tatiana Passali | Alexios Gidiotis | Efstathios Chatzikyriakidis | Grigorios Tsoumakas
Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing

Recent Deep Learning (DL) summarization models greatly outperform traditional summarization methodologies, generating high-quality summaries. Despite their success, there are still important open issues, such as the limited engagement and trust of users in the whole process. In order to overcome these issues, we reconsider the task of summarization from a human-centered perspective. We propose to integrate a user interface with an underlying DL model, instead of tackling summarization as an isolated task from the end user. We present a novel system, where the user can actively participate in the whole summarization process. We also enable the user to gather insights into the causative factors that drive the model’s behavior, exploiting the self-attention mechanism. We focus on the financial domain, in order to demonstrate the efficiency of generic DL models for domain-specific applications. Our work takes a first step towards a model-interface co-design approach, where DL models evolve along user needs, paving the way towards human-computer text summarization interfaces.