Roland Mathis


2024

pdf bib
Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems
Ivan Sekulic | Silvia Terragni | Victor Guimarães | Nghia Khau | Bruna Guedes | Modestas Filipavicius | Andre Ferreira Manso | Roland Mathis
Proceedings of the 1st Workshop on Simulating Conversational Intelligence in Chat (SCI-CHAT 2024)

In the realm of dialogue systems, user simulation techniques have emerged as a game-changer, redefining the evaluation and enhancement of task-oriented dialogue (TOD) systems. These methods are crucial for replicating real user interactions, enabling applications like synthetic data augmentation, error detection, and robust evaluation. However, existing approaches often rely on rigid rule-based methods or on annotated data. This paper introduces DAUS, a Domain-Aware User Simulator. Leveraging large language models, we fine-tune DAUS on real examples of task-oriented dialogues. Results on two relevant benchmarks showcase significant improvements in terms of user goal fulfillment. Notably, we have observed that fine-tuning enhances the simulator’s coherence with user goals, effectively mitigating hallucinations—a major source of inconsistencies in simulator responses.

2022

pdf bib
BETOLD: A Task-Oriented Dialog Dataset for Breakdown Detection
Silvia Terragni | Bruna Guedes | Andre Manso | Modestas Filipavicius | Nghia Khau | Roland Mathis
Proceedings of the Second Workshop on When Creative AI Meets Conversational AI

Task-Oriented Dialog (TOD) systems often suffer from dialog breakdowns - situations in which users cannot or do not want to proceed with the conversation. Ideally TOD systems should be able to detect dialog breakdowns to prevent users from quitting a conversation and to encourage them to interact with the system again. In this paper, we present BETOLD, a privacy-preserving dataset for breakdown detection. The dataset consists of user and system turns represented by intents and entity annotations, derived from NLU and NLG dialog manager components. We also propose an attention-based model that detects potential breakdowns using these annotations, instead of the utterances’ text. This approach achieves a comparable performance to the corresponding utterance-only model, while ensuring data privacy.